
Gesellschaft für Operations Research e.V. (GOR)
Schweizerische Vereinigung für Operations Research
Österreichische Gesellschaft für Operations Research



U. Leopold-Wildburger 
F. Rendl • G. Wäscher

Editors 










Operations Research Proceedings 2002 



Selected Papers 

of the International Conference 
on Operations Research (SOR 2002) 

Klagenfurt, September 2-5, 2002 



Springer-Verlag Berlin Heidelberg GmbH






With 120 Figures 
and 51 Tables 




Springer 




Professor Dr. Ulrike Leopold-Wildburger
Universität Graz
Institut für Statistik und Operations Research
Universitätsstraße 15/E3
8010 Graz, Austria

Professor Dr. Franz Rendl
Universität Klagenfurt
Institut für Mathematik
9020 Klagenfurt, Austria

Professor Dr. Gerhard Wäscher
Otto-von-Guericke-Universität Magdeburg
Fakultät für Wirtschaftswissenschaften
BWL VIII: Management Science
Postfach 4120
39016 Magdeburg, Germany



ISBN 978-3-540-00387-8 ISBN 978-3-642-55537-4 (eBook) 

DOI 10.1007/978-3-642-55537-4 



Cataloging-in-Publication Data applied for 

A catalog record for this book is available from the Library of Congress. 

Bibliographic information published by Die Deutsche Bibliothek 

Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed 
bibliographic data is available in the Internet at http://dnb.ddb.de. 

This work is subject to copyright. All rights are reserved, whether the whole or part of the 
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, 
recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data 
banks. Duplication of this publication or parts thereof is permitted only under the provisions 
of the German Copyright Law of September 9, 1965, in its current version, and permission 
for use must always be obtained from Springer-Verlag. Violations are liable for prosecution
under the German Copyright Law. 



http://www.springer.de 
© Springer-Verlag Berlin Heidelberg 2003

The use of general descriptive names, registered names, trademarks, etc. in this publication 
does not imply, even in the absence of a specific statement, that such names are exempt 
from the relevant protective laws and regulations and therefore free for general use. 

Cover design: Erich Kirchner, Heidelberg 

SPIN 10909454 42/3130-5 4 3 2 1 0 - Printed on acid-free paper 




Preface 



This volume contains selected papers presented at the International Conference on
Operations Research SOR 2002 held at the University of Klagenfurt from
September 2 to September 5, 2002.

The conference was organized under the auspices of the German, the Swiss and
the Austrian Operations Research societies:

- Gesellschaft für Operations Research e.V. (GOR)

- Schweizerische Vereinigung für Operations Research (SVOR)

- Österreichische Gesellschaft für Operations Research (ÖGOR).



After Vienna (1990), Berlin (1994) and Zurich (1998) this has been the fourth 
time that the three societies organized a joint conference. 

The conference was attended by more than 400 participants from countries all
over the world, which demonstrates the broad interest in all aspects of Operations
Research.

The scientific program of the conference consisted of 4 plenary lectures, 5
semi-plenary lectures, and about 320 contributed papers, which were presented in
16 sections. Due to the limited number of pages available for the proceedings
volume, the length of each article as well as the total number of contributions had to
be restricted.

The decision on the acceptance of papers for the proceedings was made in
close cooperation with the section chairmen and was based on their suggestions.

We wish to express our sincere thanks to the chairmen for supporting our editorial
work by refereeing the manuscripts and giving us their advice. We would also
like to thank Dr. Werner Müller from Springer-Verlag for his support in
publishing this proceedings volume so quickly.



Klagenfurt, November 2002 

Ulrike LEOPOLD-WILDBURGER Franz RENDL Gerhard WÄSCHER




ORGANIZING COMMITTEE: 



F. Rendl (Klagenfurt) 

I. Fischer (Klagenfurt) 

G. Gruber (Klagenfurt) 

A. Wiegele (Klagenfurt) 

B. Klinz (Graz) 

U. Leopold-Wildburger (Graz)
G. Feichtinger (Wien) 



PROGRAM COMMITTEE: 

R.E. Burkard (Graz) 

E. Fragniere (Lausanne) 

K. Frauendorfer (St. Gallen) 
R. Hartl (Wien) 

W. Kürsten (Jena)

H.-J. Luethi (Zurich) 

F. Rendl (Klagenfurt) 

T. Spengler (Braunschweig) 

G. Wäscher (Magdeburg)




Sections and Chairs 



Section 1: Production, Logistics and Supply Chain Management
S. Helber (Clausthal)
W. Jammernegg (WU Wien)

Section 2: Marketing and Data Analysis
L. Hildebrandt (HU Berlin)
U. Wagner (Uni Wien)

Section 3: Transportation and Traffic
S. Voss (TU Braunschweig)
U. Zimmermann (TU Braunschweig)

Section 4: Scheduling and Project Management
P. Brucker (Osnabrück)
E. Pesch (Siegen)

Section 5: Telecommunication and Information Technology
A. Taudes (WU Wien)
R. Gismondi (Telekom)

Section 6: Energy and Environment
A. Haurie (Genf)
A. Tuma (Augsburg)

Section 7: Public Economy, Health, Agriculture, Education
A. Stepan (TU Wien)
L. Zadnik-Stirn (Ljubljana)

Section 8: Banking, Finance, Insurances, Risk Management
A. Oehler (Bamberg)
G. Pflug (Uni Wien)

Section 9: Continuous Optimization
F. Jarre (Düsseldorf)
J.-P. Vial (Genf)

Section 10: Discrete and Combinatorial Optimization
M. Jünger (Köln)
R. Möhring (TU Berlin)

Section 11: Stochastic and Dynamic Programming
W. Römisch (HU Berlin)
R. Schultz (Duisburg)

Section 12: Simulation
P. Chamoni (Duisburg)

Section 13: Control Theory, Systems Dynamics, Dynamic Games
M. Schwaninger (St. Gallen)
G. Tragler (TU Wien)

Section 14: Game Theory, Auctioning and Bidding, Experimental Economics
F. Bolle (Frankfurt Oder)
H. W. Brachinger (Fribourg)

Section 15: Econometrics, Statistics and Mathematical Economics
W. Eichhorn (Karlsruhe)
M. Luptacik (WU Wien)

Section 16: Fuzzy Logic, Multicriteria Decision Making, Decision Theory
A. Scholl (Jena)
B. Werners (Bochum)




Contents 



Preface 

Plenary Talk and GOR-Awards 

The Role of Operations Research in Public Policy
Jonathan P. Caulkins

Risk-Return Optimization of the Bank Portfolio
Ursula Theiler

Resource-orientated Purchase Planning and Supplier Selection - Models and
Algorithms for Supply Chain Optimization and E-Commerce
Gabriele Reith-Ahlemeier

A Combinatorial Approach to Orthogonal Placement Problems
Gunnar W. Klau

Assigning Frequencies in GSM Networks
Andreas Eisenblätter

Optimal Control of Methadone Treatment in Preventing Blood-Borne Disease
Julia Almeder



Section 1: Production, Logistics and Supply Chain Management

Strategien zur Rüstzeitvermeidung in der Elektronikfertigung
Claus-Burkard Böhnlein

Optimale Belegung von Stranggießanlagen mittels 2-dimensionaler Bin-Packing-Modelle
Thomas Spengler, Oliver Seefried

Short-term Capacity Planning in Manufacturing Companies with a Decentralized Organization
Peter Letmathe

Performance Analysis of Make to Stock Supply Chains Using Discrete-Time Queueing Models
Sandeep Jain, N. R. Srinivasa Raghavan

A New Optimal Demand Forecast Model
Joachim Althaler, Herbert Jodlbauer

A Multi-Product Batch-Available-to-Promise Model for Make-to Stock Manufacturing
Richard Pibernik

Scheduling of Rolling Ingots Production
Christoph Schwindt, Norbert Trautmann

Determination of Economic Production Quantity for a Multi-Stage Production System
with Limited Storage Capacity
U. Buscher, G. Lindner

A Reverse Logistics Model with Integer Setup Numbers
Knut Richter, Imre Dobos

Lotsizing in a Production System with Rework and Product Deterioration
K. Inderfurth, G. Lindner, N.P. Rahaniotis

Mathematical Programming Models for Strategic Supply Chain Planning and Design
J. Kalcsics, M.T. Melo, S. Nickel

Ein dynamisches Verhandlungsmodell des Supply Chain Management
Eric Sucky

Modeling the Interaction between Operational and Financial Decisions in the
Inventory Pooling of Repairable Spare Parts Problem
Hartanto Wong, Dirk Cattrysse, Dirk Van Oudheusden



Section 2: Marketing and Data Analysis

Die Schätzung von Markentreue, Nichtkäuferanteil und Marktpotenzial aus Handelspaneldaten
Heribert Reisinger, Udo Wagner, Matthias Schuster

Section 3: Transportation and Traffic

A Conjugate Direction Frank-Wolfe Method with Applications to the Traffic Assignment Problem
Maria Daneva, Per Olov Lindberg

Online-Algorithmus zur Steuerung von Verkehrslichtsignalanlagen
Klaus Ladner

Optimal Sorting Machine Allocation in the Postal Distribution Network
Jaroslav Janáček

A Combined Approach to Solve the Pickup and Delivery Selection Problem
Jörn Schönberger, Herbert Kopfer, Dirk C. Mattfeld

VRP with Interdependent Time Windows - A Case Study for the Austrian Red Cross Blood Program
Karl Doerner, Manfred Gronalt, Richard F. Hartl, Marc Reimann, Kerstin Zisser

Incident Management Based on Real Time Simulation
Jürgen Zajicek, Martin Linauer, Katja Schechtner

Online-Dispatching of Automobile Service Units
Martin Grötschel, Sven O. Krumke, Jörg Rambau, Luis M. Torres

Multi-Class User Equilibria under Social Marginal Cost Pricing
Leonid Engelson, Per Olov Lindberg, Maria Daneva

School Bus Routing and Scheduling Problem
Michela Spada, Michel Bierlaire, Thomas M. Liebling

Covering Population Areas by Railway Stops
Anita Schöbel, Michael Schröder

Innovative Lösungen im bimodalen Transport Straße / Binnenstraße
Joachim R. Daduna, Johannes Schröter

Optimal Routing of Snowplows - A Column Generation Approach
Nima Golbaharan, Per Olov Lindberg, Maud Göthe Lundgren

Savings Based Ants for Large-scale Vehicle Routing Problems
Marc Reimann, Karl Doerner



Section 4: Scheduling and Project Management

Single Machine Scheduling Problems with Exponentially Start Time Dependent Job Processing Times
Alexander Bachmann, Adam Janiak, Mikhail Y. Kovalyov

Scheduling Problems with Optimal Due Interval Assignment Subject to Some Generalized Criteria
Adam Janiak, Marcin Marek

A New Exact Resource Allocation Model with Hard and Soft Resource Constraints
Ferenc Kruzslicz

Minimizing Total Weighted Tardiness on Parallel Batch Process Machines Using Genetic Algorithms
Lars Mönch, Hari Balasubramanian, John W. Fowler, Michele E. Pfund

Sorting with Line Storage Systems
Thomas Epping, Winfried Hochstättler

On Solvability of the Project Scheduling Problem with Accumulative Resources of an Arbitrary Sign
Edward Gimadi, Sergey Sevastianov

Section 5: Telecommunication and Information Technology

Cost Optimized Layout of Fibre Optic Networks in the Access Net Domain
Peter Bachhiesl, Gernot Paulus, Markus Prossegger, Joachim Werner, Herbert Stögner

Ein Algorithmus zur sicheren elektronischen Stimmabgabe über das Internet
Alexander Prosser, Robert Müller-Török

Section 6: Energy and Environment

Accelerated MILP-Strategies for the Optimal Operation Planning of Energy Supply System
Peter Hackländer, Johannes F. Verstege

On On-line Systems for Short-term Forecasting for Energy Systems
Henrik Aalborg Nielsen, Torben Skov Nielsen, Henrik Madsen

Combining Bottom-up and Finance Modelling for Electricity Markets
Christoph Weber

Gestaltung von Stoffstrom-Netzwerken zum Produktrecycling
Thomas Spengler, Grit Walther

Ein Ansatz zur Bewertung von Remanufacturingstrategien
Axel Tuma, Baptiste Lebreton

Environmental Coordination of Supply Chain Networks Based on a Multi-Agent System
Axel Tuma, Jürgen Friedl

Decision Support for the National Implementation of Emission Reduction Measures
by the Dynamic Mass Flow Optimisation Model ARGUS
Jutta Geldermann, Nurten Avci, Stefan Wenzel, Otto Rentz

Fuzzy Scheduling for the Dismantling of Complex Products
Frank Schultmann, Otto Rentz



Section 7: Public Economy, Health, Agriculture, Education

Group Decision Making Versus Expert Opinion in the Multi-Objective Analysis of Ecosystem Management
Lidija Zadnik Stirn

Section 8: Banking, Finance, Insurance, Risk Management

Capital Market Efficiency - An Empirical Analysis of the Dividend Announcement Effect
for the Austrian Stock Market
Roland Mestel, Henryk Gurgul, Christoph Schleicher

On Tail Index Estimation and Financial Risk Management Implications
Niklas Wagner

Project Risk Management by a Probabilistic Expert System
Andre Ahuja, Wilhelm Rödder

Regulatory Impacts on Credit Portfolio Management
Ursula Theiler, Vladimir Bugera, Alla Revenko, Stanislav Uryasev

Verfahren zur Risikokapitalallokation im Eigenhandel von Banken
Mario Straßberger

Section 9: Continuous Optimization

Process Optimization via Conventional Factorial Designs and Simulated Annealing
on the Path of Steepest Ascent for a CSTR
Pongchanun Luangpaiboon

Optimization on Directionally Convex Sets
Vladimir Naidenko

Section 10: Discrete and Combinatorial Optimization

Meta-Heuristiken in Virtuellen Lernumgebungen
Torsten Reiners, Imke Sassen, Stefan Voß

An Evolutionary Algorithm for Bayesian Network Triangulation
Tomasz Lukaszewski

Approximation Algorithms for the k-center Problem: An Experimental Evaluation
Jurij Mihelic, Borut Robic

MaxFlow-MinCut Duality for a Paint Shop Problem
Thomas Epping, Winfried Hochstättler, Marco E. Lübbecke

From Edge Decomposition Formulae to Composition Algorithms
André Pönitz

The Complexity of Some Problems on Maximal Independent Sets in Graphs
Igor Zverovich, Yury Orlovich



Section 11: Stochastic and Dynamic Programming

Testing Solution Quality in Stochastic Programs
David P. Morton

Scenario Updating Method for Stochastic Mixed-integer Programming Problems
Guglielmo Lulli, Suvrajeet Sen

Pricing of Multidimensional Resources in Revenue Management
Jens Feller

A Note on Quantitative Stability and Empirical Estimates in Stochastic Programming
Vlasta Kaňková, Michal Houda

Splitting and Localization of the Epi-Topology Combined with Randomness
Petr Lachout

Section 12: Simulation

Standards für Modellierung und Simulation
Claus-Burkard Böhnlein

Section 13: Control Theory, System Dynamics, Dynamic Games

System Dynamics (SD) - An Approach within Corporate Planning
Peter Bradl

Optimal Decision Rules in a Monetary Union
Doris A. Behrens, Reinhard Neck

Impact of Feedback Loop on Group Decision Process when Applying System Dynamics Simulators
Andrej Skraba, Miroljub Kljajic

Section 14: Game Theory, Auctioning and Bidding, Experimental Economics

The Management Game SINTO-Market - Report on Some Recent Experiments
Otwin Becker, Tanja Feit, Ulrike Leopold-Wildburger, Susanne Lind-Braucher,
Jörg Schütze, Reinhard Selten

Bounds & Likelihood Procedure Revisited
Otwin Becker, Johannes Leitner, Ulrike Leopold-Wildburger, Jörg H. Schütze

On the Allocation of Excesses of Resources in Linear Production Problems
F. R. Fernández, G. Fiestras, I. García-Jurado, J. Puerto

Section 15: Econometrics, Statistics and Mathematical Economics

Simulation eines CO2-Zertifikatenhandels und algorithmische Optimierung von Investitionen
Silja Meyer-Nieberg, Stefan Pickl

Indirect Expenditure Functions and Shephard's Lemma
S. Fuchs-Seliger

Bayesian Estimation of the Heston Stochastic Volatility Model
Sylvia Frühwirth-Schnatter, Leopold Sögner

Forecasting with Leading Economic Indicators - A Neural Network Approach
Timotej Jagric

Estimating Multivariate Conditional Distributions - An Application to the Truck Sales Forecast
Eric A. Stützle, Tomas Hrycej

Application of Techniques of Functional Data Analysis to Spectroscopic Data
Vera Hofer

Integrating Exchange Rate Theory in Data Mining
Bernd Brandl

The Ability of Artificial Neural Networks to Exploit Non-Linearities by Data Mining
Models Compared to Statistical Methods
Lutz Beinsen, Bernd Brandl



Section 16: Fuzzy Logic, Multicriteria Decision Making, Decision Theory

Preference Measurement with Conjoint Analysis and AHP: An Empirical Comparison
Roland Helm, Laura Manthey, Armin Scholl, Michael Steiner

Further Development of MADM-Approaches in China and in Germany
Jutta Geldermann, Kejing Zhang, Otto Rentz

Wissensrevision in einer MaxEnt/MinREnt-Umgebung
Elmar Reucher, Wilhelm Rödder

Innovation, Operations Research & Decision Support in the Military
Heiner Micko

Von der Prädikatenlogik zur unternehmerischen Entscheidungsunterstützung
Friedhelm Kulmann, Wilhelm Rödder




The Role of Operations Research in Public Policy 

Jonathan P. Caulkins 
Carnegie Mellon University 

H. John Heinz III School of Public Policy and Management 
5000 Forbes Ave. 

Pittsburgh, PA 15213-3890 
caulkins@cmu.edu 

Abstract 

This paper reflects on the role Operations Research (OR) has and has not played in 
public policy. It is based on a plenary talk delivered on September 4, 2002 to the 
International Conference on OR held in Klagenfurt, Austria. It concludes that OR 
clearly makes many important contributions to public policy making, but relatively 
speaking OR makes the smallest contribution to strategic issues for which 
authority is distributed and for which the “physics” of the underlying system are 
not central. One reason is that there are so many layers of people between the 
typical Operations Researcher and the key policy makers. OR models and model 
results are not effective from such a distance because they are distorted and diluted 
each time they are translated. An alternative path of influence would be to modify 
OR curricula so that people with OR training can take jobs “closer” to the key 
policy makers. 



1. Introduction: The Glass is Half-Full and Half-Empty

The goal of this paper is to reflect on and draw lessons from the role 
Operations Research (OR) has played in public policy. I will not review a series 
of models. Perusing Pollack et al.’s (1994) very fine volume is a more efficient 
way to do that. Rather, I will try to characterize what OR has and has not 
contributed, and why. 

My principal qualification for undertaking this is that I am Professor of
Operations Research and Public Policy at Carnegie Mellon University, one of the
very few academics with both phrases in my job title. My principal limitation is
that my experience base and perspectives primarily concern policy in the US. I 
will tell a story that I believe is true with respect to the US, and allow the reader to 
determine whether it applies in Austria or elsewhere outside the US. 

In organizing these thoughts, it is useful to refer to the metaphor of a 
glass that can be viewed as half-full or half-empty, depending on what the observer 
focuses on. In that vein, I begin by pointing out that the glass of OR in the 
furtherance of public policy is clearly half-full and half-empty. 



Given that this paper is addressed to an OR audience, there is limited
value in belaboring the many successful applications of OR to public policy. That
would be preaching to the choir. The profession can be justifiably proud of its
contributions during the Second World War and the Cold War. Public sector
applications enjoy a significant “market share” among articles in Interfaces, the
premier journal for OR applications, and among recipients and finalists of the
Franz Edelman Award, an award described as the “Super Bowl” of OR (Horner, 2002).

Linear programming is in some sense the methodological heart of OR, 
and two of the classic applications used in textbooks to motivate linear 
programming are public policy applications: the “diet problem” that played a 
central role in defining the poverty level and the “school busing problem” that 
emerged from court orders to desegregate schools in the US.
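
As a reminder of the structure involved, the diet problem can be written as a small linear program. The formulation below is the generic textbook sketch rather than anything taken from this paper; the symbols $x_j$, $c_j$, $a_{ij}$, and $b_i$ are notation assumed here purely for illustration.

```latex
% Assumed notation (not from the original text):
%   x_j    quantity of food j purchased (decision variable)
%   c_j    cost per unit of food j
%   a_{ij} amount of nutrient i contained in one unit of food j
%   b_i    minimum required amount of nutrient i
\begin{align*}
\min_{x} \quad & \sum_{j=1}^{n} c_j x_j \\
\text{s.t.} \quad & \sum_{j=1}^{n} a_{ij} x_j \ge b_i, \qquad i = 1, \dots, m, \\
& x_j \ge 0, \qquad j = 1, \dots, n.
\end{align*}
```

The objective selects the cheapest basket of foods, and each constraint guarantees that one nutritional requirement is met, which is why the cost of the optimal basket could serve as an input to defining a poverty level.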

Rather, for this audience, it is more important to make the case that there 
is a portion of the glass that is half empty. That case can be made by looking at 
our conferences, journals, professional society sections, and courses. 

The leading North American research-oriented OR conference is 
INFORMS. Examining the program for this fall’s conference is instructive. There
are certainly some talks applying OR to public policy problems. Indeed, I will be
giving one in a perennially popular session organized by Arnold Barnett entitled
“Threats to Life and Limb”. More importantly, two prominent keynotes address
“Modeling Bioterror Response Logistics” and “Restructuring Electricity Markets”.
However, the proportion of research talks that intersect public policy is modest, 
and the vast majority are concentrated in select domains. Of 21 invited session 
“tracks”, three address topics that intersect public policy in important ways 
(Aviation, Healthcare, and Electricity Markets & Energy Modeling). Similarly for 
four of the 25 sponsored session tracks (Energy/Natural Resources/Environment, 
Health Applications, Railroad Applications, and Location Analysis). And clearly 
not all talks within tracks that intersect public policy actually address public 
policy. For example, typically most INFORMS presentations on railroad 
applications strive to improve the operational efficiency and profitability of 
railroad companies, rather than addressing how the government should regulate 
railroads. 

One might expect the premier research conference to focus on methods 
development, and methods are generally not specific to either public policy or 
business applications. Perhaps the practitioner conferences are more focused on 
public policy? In fact, however, that is not the case. In North America the premier 
OR practitioners’ conference is held each May. At the May 2001 Conference, 
none of the 5 tracks or 41 speakers identified on the conference web page were 
specific to public policy and management. 

Another possibility is that public policy applications are addressed
predominantly in special purpose OR conferences. However, of the 83 upcoming
conferences listed on the INFORMS web site on August 28, 2002, only two were
clearly about public policy (the 23rd Army Science Conference and the
International Conference on OR for Development) and two were on industries that
are heavily regulated (2nd International Conference on Freight Transportation
Systems and the International Conference on Telecommunications).

In many respects, the present conference stands out. Three of four 
plenary talks pertain to public policy, and five of 16 sections have significant
intersection with public policy. They are Energy and Environment; Public 
Economy, Health, Agriculture, and Education; Telecommunications and IT; 
Transportation and Traffic; and Control Theory, Systems Dynamics, and Dynamic 
Games. (The last is defined by its methodology, but many of the applications 
pertain to public policy.) 

The story with OR journals is similar. Operations Research has four 
departments that address specific domains that intersect public policy 
(Environment, Energy, and Natural Resources; Military; Telecommunications; and 
Transportation), and Interfaces has two (Healthcare and Military). All three major 
North American journals have departments with names like “Public Sector 
Applications” (Management Science), “Policy Modeling and Public Sector OR” 
(Operations Research), and “Public Sector Department” (Interfaces). However, as 
an area or associate editor for the first two, I can attest that the volume of 
submissions is not high. The European Journal of Operational Research does not 
have any such dedicated department, although it has had special issues, e.g., one 
on “Optimizing Public Policy” in 1992. 

One could argue that most of the publishing occurs in field or specialty 
journals, but none of them is focused on public policy generally. Conversely most 
journals that focus on public policy (e.g., the Journal of Policy Analysis and 
Management, Policy Sciences, etc.) are not dominated by articles using OR 
methods. Socio-Economic Planning Sciences is one of the few journals with a 
broad interest in public policy for which a substantial share of the articles “look 
like” OR articles. 

The same story plays out in the INFORMS subdivisions which are called 
“sections” and “societies”. The two sections that focus explicitly on public policy 
(Public Programs and Processes and Social Science Applications) are among 
INFORMS’ smallest. Seven other sections and a “society” intersect specific 
public policy domains (Aviation; Energy, Natural Resources, and the 
Environment; Health; Location Analysis; Railroad Applications; 
Telecommunications; Transportation; and the Military Applications society). 

I have not made a thorough review of courses at the intersection of OR 
and public policy, but in general few public policy schools offer, let alone require, 
OR/MS courses and few OR programs offer, let alone require, courses in public 
policy. Indeed, as far as I know there are no introductory textbooks on OR and 
public policy in print. I do have the privilege of teaching a required core course in 
Management Science at Carnegie Mellon’s policy school (Caulkins, 1999), but it 
is almost one of a kind and I am forced to use a standard Management Science 
textbook geared toward MBA students. 








2. Which Half of the OR and Public Policy Glass is Full? 

This review of conferences, journals, sections, and courses not only 
underscores that the glass of OR and public policy is half-empty as well as half- 
full, but it also indicates what part is full and what part is empty. Specifically, it 
suggests distinguishing five categories of roles OR plays in public policy. 

Roles OR Plays in Public Policy 

(1) OR for organizations and management generally that applies to public problems

(2) OR for “regulated industries” 

(3) OR for public resource management 

(4) OR applied to government-provided services 

(5) OR applied to government policy making 

The first category addresses those aspects of government management 
that strongly parallel operations in the for-profit or non-profit sector, such as using
supply chain management for blood banks or vehicle routing problem algorithms
to manage dial-a-ride services. Roughly speaking, OR is about as central to the
management of government activities as it is to the management of typical
non-governmental activities. One could write an interesting parallel paper on that topic,
but I think the conclusion is, again, half-full/half-empty. OR is clearly important to
many businesses, but it is sobering to note that MBA programs in the US no longer 
need to teach OR/Management Science as a core topic, and most exercise their 
right not to do so. 

The second category is probably the largest in terms of journal articles, 
conference presentations, and the like. These are analyses of industries that are so 
highly regulated that government has a vested interest in the operation and 
performance of those industries that goes beyond its interest in the vitality of the 
economy generally. Principal among these industries are aviation, 
energy/electricity, healthcare, telecommunications, and transportation. Agriculture 
is another such example at this conference. 

Within these regulated industries, some OR work primarily helps 
companies, e.g., yield management for airlines. Other work is very directly related 
to policy: e.g., modeling how far apart planes should be spaced to yield an 
acceptable risk of collision. Some is intermediate, relevant to both the firms and 
the regulating bodies (e.g., modeling of ground holds). 

The third category pertains to the management of publicly owned 
environmental and natural resources such as forests, fisheries, and water. These 
include forestry planting and logging policies, setting harvest quotas for fish, 
managing irrigation projects, allocating water rights, and multi-objective planning 
of entire river basins as at the Tennessee Valley Authority. 







The fourth category pertains to services provided by the government. 
Larson and Odoni’s (1981) text on Urban Operations Research is a paradigmatic 
example. It describes, among other things, the “hypercube queuing model” for 
priority queuing that underpins 911 emergency dispatch systems throughout the 
US. 



Healthcare applications belong in this category as well as among the 
regulated industries even in the US, not just in countries with nationalized 
healthcare. The military and the Veterans Administration directly operate many
healthcare facilities that generate important and interesting operational problems. 

At least in the US, military applications more generally are an extremely 
important special case of a government “service” whose provision is guided by OR 
analysis. Indeed, the origins of OR as a profession are in military applications, and 
the US military may be the world’s largest consumer of OR analysis. 

The fifth category is the use of OR to guide government policy in the 
narrow sense of the word “policy.” At some level it is hard to distinguish what is 
policy and what is management or implementation. If an OR model is used to 
determine that blood inventories in large hospitals should use a first-in-first-out 
inventory policy, whereas small hospitals should use last-in-first-out (Pierskalla, 
2002), then in some sense OR is directly affecting policy. But the “narrow” 
definition of policy I have in mind for this fifth category is analysis that guides the 
writing of laws, regulations, or court decisions that affect the behavior of agents 
who are not themselves government employees or direct contractors (such as 
logging firms hired to cut on Forest Service land). 

For example, David Paltiel’s doctoral dissertation addressed the Food and 
Drug Administration’s drug approval process as a sequential decision analysis 
(Paltiel and Kaplan, 1993). Ed Kaplan (1994, 1995) modeled the effect of syringe 
exchange programs (SEPs) on the spread of HIV/AIDS with the goal of informing 
policy decisions concerning the legality and funding of SEPs. I, along with
colleagues at RAND, Carnegie Mellon, and the Technical University of Vienna,
have developed OR models of the spread and control of drugs, crime, and violence
that can guide decisions concerning sentencing of repeat criminal offenders and
the allocation of resources to various types of drug control operations (cf.
Caulkins, 2000). Julia Almeder’s award-winning thesis is another example (Balta,
2002).



Although there are examples of OR analyses of this sort, to me what is 
striking is how few there are relative to the number of important and interesting 
policy problems. Furthermore, at least in the areas with which I am most familiar, 
these models do not always have a decisive influence. I can cite instances in which 
the analysis made a concrete difference, but it is even easier to cite examples in 
which politics interfered, or the concerns of some special interest group prevailed 
over what was in the common good, or the analysis was simply deemed to be too 
complicated or too “academic” to weigh heavily in the policy decision process. 








This pessimism about the magnitude of OR’s impact on public policy 
making is enhanced if I approach the question in the opposite direction. If I make 
a list of major US policy actions - other than those involving the military - there 
are few for which OR analysis played a key role, or at least a role that was 
sufficiently visible that a casual observer such as myself is aware of that role. (See 
Table below.) In contrast the President’s Council of Economic Advisors weighs in 
on most important policy issues. The exclusion of military policy and decisions is 
important. OR analysis does figure prominently in strategic decisions concerning 
defense, ranging from the cancellation of major weapons systems to decisions about 
preparation and strategy for war. 

Recent Major US Domestic Policy Actions 
  Welfare reform 
  Bush tax cut (e.g., of estate tax) 
  Creation of the Department of Homeland Security 
  Accountability movement in K-12 education 
  Deregulation of electricity markets 
  Laissez-faire antitrust policy, especially in telecommunications 
  Aid to major air carriers after September 11th 
  “Competitive” approach to attracting firms for local economic development 
  Stem cell research compromise 
  Extending public funding of Amtrak 

Recent Major US International, Non-Military Policy Actions 
  IMF intervention in Brazil and Uruguay but not Argentina 
  NAFTA and other “fast track” trade negotiations 
  Plan Colombia 
  Not following the Kyoto agreement 

One concise summary of these observations concerning where OR does 
and does not play a major role in public policy is the following. OR plays an 
important role if any one of three conditions holds: (1) The issues pertain to 
tactical management or are at the implementation level. (2) The “physics” of the 
system are complex and central. Or, (3) Decision making is centralized in a 
command and control hierarchy. Where OR plays a secondary role is in strategic 
issues with distributed authority in general domains. 

The distinction between domains in which the “physics” of the system are 
complex and central vs. other domains is meant to recognize and explain OR’s 
prominent role in aviation, transportation, energy, telecommunications, etc. Few 
people believe one can think intelligently about appropriate separation distances 
for aircraft or the best way to manage ground hold delays without some 
mathematical model. The use of OR or some other quantitative paradigm is 
central and unavoidable in a way that is not true in welfare policy, education 
policy, or equal employment law. 








The distinction between centralized decision-making and distributed 
authority is meant to recognize and explain why OR plays a more prominent role 
in defense policy than in policy toward homelessness. Defense policy is 
centralized in the Department of Defense. In contrast, homelessness is a concern 
of multiple federal agencies and importantly of the 50 states, roughly 3,000 
counties, and tens of thousands of municipalities in the US. Likewise, the 
Department of Defense is hierarchical whereas Congress and other legislative 
bodies are not. It is not literally true, as is sometimes asserted, that all one has to 
do is convince the top general of the merit of an idea and implementation will take 
care of itself, but that is closer to being true in military policy than in some other 
domains. 

3. Why is the Other Half Empty? 

A natural question to ask is why doesn’t OR play a more central role in 
strategic issues with distributed authority in general domains? Many of the relevant 
factors are not specific to OR, but pertain to all model-based sciences. For 
example, it is conventional wisdom that one can lie with statistics. (Recall the 
axiom that there are three kinds of lies: “Lies, damned lies, and statistics.”) The 
uninitiated may not (and in this context perhaps should not) distinguish numbers 
produced by OR models and numbers produced by statistical analysis. 

In many policy debates, there are diverse stakeholders with competing 
interests. Ideally in the scientific world, debates emerge when people disagree 
because they have different understandings and different interpretations but the 
same objective (to seek the “truth”). In the political world, two parties with 
completely concordant understanding of the implications of the policy may debate 
viciously because they have different objectives. One Senator may represent a 
small urban state; another may represent a rural state in the Plains. They may 
disagree about what federal gun control policy should be even if they have exactly 
the same understanding of how that policy would affect crime and access to guns 
for sport and hunting in the nation as a whole. 

Stakeholders typically are not interested in finding the objective truth. 
They are advocates of a particular position or policy, and they seek to “spin” 
scientific evidence to their benefit. (Consider Otto Rentz’ observation at this 
conference (2002) that advocacy in the context of environmental policy is “The art 
of making your neighbor believe you have done more than you did so he will do 
more of what you didn’t do.”) In particular, advocates may cite selectively or 
otherwise distort the results of OR studies and other scientific analyses in order to 
serve their interests. Recognizing this, scientific evidence cited by stakeholders 
and advocates is viewed skeptically by other parties. Since the democratic process 
is fundamentally about stakeholders advocating their positions in public forums, 
far more time, money, and resources are invested in advocacy-oriented research 
and reporting of research than is invested by people without a stake in the issue. 
Hence, most citing of scientific evidence relevant to policy discussions is not 
objective. Knowing that, people greet with skepticism even the minority of studies 
that do strive for objectivity. 

Exacerbating this problem is the fact that models do not stand up well to 
cross-examination. It is easy to criticize a model and to do so in a way that 
undermines its credibility with people who are not familiar with the concept and 
merits of mathematical modeling. By definition models are simplifications of 
reality, and one cannot scientifically prove the absence of something, so the 
analyst has a hard time responding to questioning along the following lines. 
“Professor Smith: Does your model simplify the issue at hand?” The only honest 
answer is yes, models simplify. Then, “Professor Smith: Can you prove to the 
committee that this simplification absolutely has no effect on your results?” The 
only honest answer to this is, “No I cannot prove that because the simplification 
does in fact affect the results” — even if in the analyst’s judgment the effects are 
second-order and do not compromise the fundamental conclusions. The next 
witness questioned may be an individual who can describe in poignant detail how 
his or her life was devastated by the policy in question, and then the non-scientists 
are left to decide whether to trust the colorless judgment of an egg-headed analyst 
or the heart-wrenching story of one individual’s eyewitness account. In such 
circumstances, anecdotes often trump models. 

As an aside, I do not think that this is always bad. Anecdotes and 
individual accounts can highlight particular issues, and some issues are not 
fundamentally amenable to mathematical analysis. 

None of these observations are particularly novel, but I can use a bit of 
OR, specifically a network diagram, to capture their essence in a picture that I 
think may be novel. It makes reference to the 1993 movie “Six Degrees of 
Separation,” based on the John Guare play. The movie’s premise is that everyone 
in the world is connected by no more than six links of the sort, “I know someone 
who knows someone who knows someone ... who knows you.” A common 
version involves showing that every movie actor is connected to Kevin Bacon by a 
chain of shared movies “Person in question was in a movie with A who was in a 
movie with B . . . who was in a movie with Kevin Bacon” with no more than six 
links in the chain. Hence, the idiom that no one - even OR methodologists - is 
more than six steps removed from Kevin Bacon. 

Only somewhat facetiously, I assert that Operations Researchers and 
strategic public policy makers are a counter-example, at least if we define a link as 
indicating sufficient personal contact to pass along the sort of sophisticated 
thinking and ideas that emanate from OR models. The figure below illustrates the 
links between typical OR methodologists and the policy makers who make 
strategic decisions. You will note it has seven links. Hence, I claim that senior 
policy makers who “bring home the pork” are more distant from OR 
methodologists than is Kevin Bacon. This diagram is influenced by what I have 
observed in the area of drug and crime policy, but I submit that the gist of the 
diagram probably holds for welfare policy, education policy, homelessness policy, 
and many other important public policy domains. 



Figure: The Senior Policy Makers Who “Bring Home the Pork” are More Distant 
from OR Methodologists than Is Kevin Bacon 



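[Figure: chain of groups between OR methodologists and senior policy makers. Box 
labels recoverable from the scan, in order of appearance: OR Methods & Algorithms 
Researchers; Applied OR Researchers; Policy-oriented academic OR; Multidisciplinary 
policy academic community; (one label illegible); Bureaucrats; Staffers; 
Public/Media; Opinion polls; Senior Political Appointees and Lawmakers.] 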



I do not want to subject this diagram to too literal an interpretation. I 
personally straddle levels two and three in this diagram but have myself briefed 
senior policy makers and given testimony before Congress. However, most 
people, myself included, spend most of their time interacting with people who are 
at most one level away in this chain, and that creates severe communications 
problems between one end of the chain and the other. 

Recall the children’s game of “telephone”. A group of children sit in a 
circle. One whispers something into the ear of the child next to him or her. That 
child in turn whispers what he or she hears to the next child, and so on around the 
circle. By the time the message gets back to the original speaker, it has usually 
been distorted beyond recognition. Sometimes distortion of that magnitude seems 
to happen as ideas move up or down this multi-layered chain of people separating 
the OR community from senior policy makers. 

When I step back and recognize how many layers there are in this chain, 
it does not surprise me that OR models are not driving strategic policy making 
outside the select domains where the “physics” of the system are complex and 
central and/or there is centralized decision making. (The latter allows analysts to 
brief directly key decision makers who can then act more or less unilaterally, as in 
some aspects of military policy.) 

4. How Can We Fill the Other Half? 

The final question is what might we do to enhance the role OR plays in 
strategic public policy. Perhaps one should first ask whether that would be 
desirable. I believe the answer is yes. I think the OR perspective offers a valuable 
way to integrate complex and competing considerations. Since I’m addressing an 
OR audience, I will simply assume that at least some of you agree. 

So how might one enhance the role of OR in strategic public policy 
making? As suggested, I do not think it is reasonable to expect models to move all 
the way down this chain by themselves. Rather, I think we have to think of ways 
of moving people up or down this chain to reduce the number of links between OR 
methods and senior policy makers. Conceptually I can think of at least four ways 
of doing that. 

In principle the most direct approach would be to give executive 
education in OR to senior executives and lawmakers. Practically speaking, that 
will not work. Most people who have ascended to such pinnacles of power have 
long since forgotten their mathematics. (Those that remember math are probably 
working in areas like environmental policy where OR is already affecting decisions 
at the strategic level.) Furthermore, such senior people are very busy. They would 
rather attend executive education tailored to their policy domain, not study a 
methodology. 

A second approach would recognize that most leaders have an advanced 
professional degree. If every such curriculum required one course in OR, that 
might stimulate understanding of and demand for OR analysis by those people 
after they move up the career ladder. However, we have already tried this, and it 
did not work. Business schools in the US used to require a core course in OR, but 
once accreditation requirements were relaxed and they no longer had to do so, 
most schools moved decisively away from it because the OR community could not 
deliver a course that was sufficiently valuable. Among MPA/MPP programs in the 
US, I think only CMU requires Management Science, and only a few other schools 
(e.g., the University of Michigan) even offer such classes in house. If this 
approach did not work for MBAs vis-à-vis careers in business, it is unlikely to 
work for MBAs vis-à-vis government policy, let alone lawyers and government 
policy. 



The third possibility is similar, but one step further back in the education 
process. One could require an OR course as part of the general education 
requirements for a first university degree. Every discipline could probably make a 
parallel case, but the case for OR should not be dismissed. In principle, a general 
or liberal arts education’s main value comes from teaching critical thinking, but the 
capacity to deliver on that promise has been eroded in many universities, where 
class sizes are very large and students pick courses from a cafeteria menu of 
electives (vs. a more regimented and cumulative classic liberal arts education). 
Learning OR and mathematical modeling may be a more effective and more 
appreciated way of teaching critical quantitative thinking than the current default, 
which is a course in calculus. One could even make a case for teaching 
mathematical modeling earlier, in secondary school. However, making OR a 
core subject in liberal studies is not likely to happen and, at any rate, is not 
something the OR community can decide to do unilaterally. 

In contrast the fourth possibility, and the one that I want to elaborate on, 
is something that OR faculty could implement on their own because it pertains to 
graduate education in OR. I have in mind education at the masters level because a 
Ph.D. is preparation for academic life, not high-level leadership in the public or 
private sector. OR masters programs vary across universities, influenced by 
factors such as whether they sit in a business school or an engineering school, but 
they have a lot in common. In particular, I do not know of any that were designed 
specifically with the goal of maximizing how far along this ladder students would 
get in their career. So it is interesting to ask: how might we change some of our 
OR masters programs if the stated goal were to push graduates of these programs 
as far as possible along the trajectory toward strategic leadership, rather than to 
maximize success in a technical position in the first few years post-graduation? 

I do not know what the answer is, but to be provocative I’ll suggest the 
following five possibilities. 

♦Provide a broader technical training, even at the expense of some depth. E.g., 
students might be better off taking an extra course each in statistics, 
negotiations, and GIS, and only three not six courses in optimization. 
♦Emphasize applications and problem structuring not algorithms, where the 
applications are real, not textbook problems. E.g., require project courses for 
real (paying) clients. 

♦Stress communication skills, including how to communicate intuition and insight, 
not just technical descriptions, and how to communicate to people who are not 
themselves trained in OR (in contrast, say, to learning how to write papers for 
OR journals). 

♦Recruit a different kind of applicant. Nerds with perfect standardized test scores 
may not be successful in moving down this chain, even if they get a 4.0 GPA 
in the program. Admissions criteria might need to give more weight than they 
now do to prior experiences such as being captain of a sports team, president 
of one’s high school graduating class, or having started a company or 
non-profit organization. 

♦Consider changing the degree title to something like a “Masters in Strategic 
Analysis” rather than a Masters in OR. The title “Operations Research” has 
very limited name recognition, not only relative to Coca-Cola but also relative 
to competing disciplines. 

Again, I offer these ideas to be provocative and stimulate discussion, not 
with any pretense that this is the “right” answer. Other possibilities are worth 
considering, such as joint degrees in OR and a “soft” discipline or OR and a 
substantive domain such as Chemical Engineering for environmental policy or an 
M.D. for health policy. 

5. Summary 

OR has made important contributions to public policy, but a clear-eyed 
review suggests that its role has been greater in some areas than in others. 
Relatively speaking, OR’s role has been smaller in the analysis of strategic issues 
with distributed authority in general domains. A major limitation is that many 
layers of people now separate the typical Operations Researcher from senior policy 
makers and lawmakers, and for various reasons it is hard to push models far down 
that chain. If one wished to expand OR’s role in strategic policy making beyond 
domains (such as defense policy) where it is already prominent, one strategy would 
be to create OR masters programs that differ in important ways from the typical 
program of today. 








References 



1. Balta, Julia, 2002: Optimal control of methadone treatment in preventing 
blood-borne disease. Masters Thesis, Institute for Econometrics, Operations 
Research, and Systems Theory, Vienna University of Technology 

2. Caulkins, Jonathan P., 1999: The revolution in management science 
instruction: Implications for teaching public affairs students. The Journal of 
Public Affairs Education, 5:107-117 

3. Caulkins, Jonathan P., 2000: Measurement and analysis of drug problems and 
drug control efforts. In: Duffee D, McDowall D, Green Mazerolle K, 
Mastrofski S (eds.) Criminal Justice 2000: Volume 4, Measurement and 
Analysis of Crime and Justice, USGPO, Washington DC, pp 391-449 

4. Horner, Peter, 2002: And the winner is ... OR/MS Today, 29 

5. Kaplan, Edward H., 1994: A method for evaluating needle exchange programs. 
Statistics in Medicine, 13:2179-2187 

6. Kaplan, Edward H., 1995: Probability models of needle exchange. Operations 
Research, 43:558-569 

7. Larson, Richard C., Odoni, Amedeo R., 1981: Urban Operations Research. 
Prentice-Hall, Englewood Cliffs, New Jersey 

8. Paltiel, A.D., Kaplan, E., 1993: The epidemiological and economic 
consequences of AIDS clinical trials. Journal of Acquired Immune 
Deficiency Syndromes, 6:179-190 

9. Pierskalla, William, 2002: Blood bank inventory control - a supply chain 
approach. Plenary talk at the International Conference on Operations 
Research, Klagenfurt, Austria, September 3, 2002 

10. Pollock, S.M., Rothkopf, M.H., Barnett, A.I. (eds.), 1994: Operations 
Research and the Public Sector. Handbooks in Operations Research and 
Management Science, Volume 6, North-Holland, New York 

11. Rentz, Otto, 2002: Contributions of Operations Research to Environmental 
Policy. Plenary talk at the International Conference on Operations Research, 
Klagenfurt, Austria, September 2, 2002 





Risk-Return Optimization of the Bank Portfolio 



Ursula Theiler 

Risk Training, Carl-Zeiss-Str. 11, D-83052 Bruckmuehl, Germany, 
mailto:theiler@risk-training.org. 



Abstract 

In an intensifying competition, banks are forced to develop and implement 
enterprise-wide integrated risk-return management systems. Financial risks have to 
be limited and managed from a bank-wide portfolio perspective. Risk management 
rules must be complied with from both internal and regulatory points of view. 
Expected returns need to be maximized subject to these constraints, leading to a 
generalized portfolio optimization problem under different capital limits. 

We give an overview of a risk-return optimization model for the bank portfolio that 
maximizes the expected returns to the planning horizon with respect to internal 
and regulatory loss risk constraints. We derive consistent planning information 
that ensures efficient return targets and maximal use of the economic and 
the regulatory capital. The impact of the optimization is shown by an application 
example. 



1 Introduction 

In an intensifying competition, banks are forced to develop and implement 
enterprise-wide integrated risk-return management systems. Financial risks have to 
be limited and managed from a bank-wide portfolio perspective. Risk management 
rules must be complied with from both internal and regulatory points of view. 
Expected returns need to be maximized subject to these constraints, leading to a 
generalized portfolio optimization problem under different capital limits. 

In chapter 2 we give an overview of a risk-return optimization model that 
maximizes the expected returns of the bank portfolio to the planning horizon subject to 
internal and regulatory loss risk ceilings. The internal risk constraint is based on 
the new risk measure of Conditional Value at Risk (CVaR), which has been proved 
to be appropriate for measuring bank-wide loss risk [4,5]. We solve the 
optimization problem by a CVaR optimization algorithm by Rockafellar/Uryasev [4,5]. 
The regulatory capital restrictions represent the ‘Basle Rules’ of risk limitation 
[1,2]. We derive consistent planning information from the optimum solution that 
ensures efficient return targets and a maximal use of the economic and regulatory 
capital. The impact of the optimization is shown by an application example in 
chapter 3. We close with a brief summary in chapter 4. 







2 Risk-Return Optimization Model for the Bank Portfolio 

2.1 Survey 

For its planning processes the bank needs to identify risk-return efficient target 
portfolios that maximize expected returns to the planning horizon and meet risk 
constraints from different points of view. From an internal perspective, the bank 
limits its loss risks by the economic capital available. At the same time, the bank 
must observe the legal loss risk boundaries of the ‘Basle rules’, which comprise 
constraints on the risks of the banking book and the trading book, combined with 
limits on the capital components that cover the different kinds of risks [1,2,7]. We 
obtain the following general model structure of the bank-wide risk-return 
portfolio optimization problem (P): 

(P)                                                                  (1) 

Objective function: Maximize expected returns 
subject to the constraints 

Constraint 1: Internal risk ≤ Economic capital. 
Constraint 2: Regulatory risk ≤ Regulatory capital; 
              constraints on the regulatory capital components. 
Constraint 3: Position bounds, definition of the feasible solutions. 

We develop the optimization model in three steps. First we define a risk 
measure that is appropriate to measure the loss risk of the bank portfolio (chapter 2.2). 
Next we introduce an optimization algorithm for the solution of the basic 
risk-return optimization problem with respect to the constraints 1 and 3 (chapter 2.3). 
We then extend the problem to an optimization model for the bank planning 
process by integrating the regulatory risk constraints (chapter 2.4). 



2.2 Definition of the CVaR Risk Measure 

While the risk measure of Value at Risk, commonly applied in finance for market 
risk measurement, lacks the elementary property of sub-additivity if the loss 
distributions are not normal, the Conditional Value at Risk (CVaR), defined as the 
conditional expectation beyond the Value at Risk, has been proved to be 
appropriate for risk measurement of any loss distribution [4,5]. 

Let x be the vector of the positions of the bank portfolio assets and y the vector 
of the corresponding market prices. We define the Conditional Value at Risk 
deviation of the portfolio loss, CVaR_α(L(x,y)), as 

\[ \mathrm{CVaR}_\alpha(L(x,y)) = E\big[\, L(x,y) \mid L(x,y) > \mathrm{VaR}_\alpha(L(x,y)) \,\big], \tag{2} \] 

where L(x,y) is defined as the difference between the expected and the uncertain 
portfolio value at the horizon, L(x,y) = E[y]'x - y'x, and VaR_α(L(x,y)) is the 
α-quantile of the loss function L(x,y).^1 In the case of discontinuities at the 
α-quantile, CVaR can be defined as a weighted average of the VaR and the 
conditional expectation beyond the VaR [5]. 

^1 We apply the term CVaR deviation, as the loss distribution measures the deviation of the 
uncertain from the expected portfolio value. 

In the above definition CVaR deviation is a convex risk measure that ensures 
the existence of a risk minimum portfolio on a convex set and the solvability of 
the optimization problem (P). It is appropriate to measure loss risk from any 
asymmetric and discontinuous loss distribution with discrete probabilities [5]. The 
following figure shows the loss function and the risk measure of CVaR deviation 
at the confidence level α: 




Fig. 1. Loss distribution L(x,y) and risk measures VaR and CVaR 
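To make the definition concrete, the following minimal Python sketch estimates VaR and the CVaR deviation from a finite set of scenarios. It assumes equally likely scenarios and uses the weighted-average form for the discrete case mentioned above; the function name `cvar_deviation` and the ten scenario values are hypothetical and serve only as an illustration.

```python
import math


def cvar_deviation(values, alpha):
    """Return (VaR, CVaR) of the loss L = mean(values) - value.

    Assumes equally likely scenarios and 0 < alpha < 1.
    """
    k = len(values)
    mean_value = sum(values) / k
    # Loss is the shortfall of the scenario value against the expected value.
    losses = sorted(mean_value - v for v in values)
    # VaR: smallest scenario loss whose cumulative probability reaches alpha.
    idx = max(math.ceil(alpha * k - 1e-12) - 1, 0)
    var = losses[idx]
    # Probability mass at or below VaR, and probability-weighted loss beyond VaR.
    psi = sum(1 for loss in losses if loss <= var) / k
    tail = sum(loss for loss in losses if loss > var) / k
    # Discrete case: weighted average of VaR and the expectation beyond VaR.
    cvar = ((psi - alpha) * var + tail) / (1.0 - alpha)
    return var, cvar


# Hypothetical portfolio values at the horizon in ten scenarios.
values = [103, 102, 101, 100, 100, 99, 99, 98, 96, 92]
var, cvar = cvar_deviation(values, alpha=0.8)
print(f"VaR = {var:.2f}, CVaR deviation = {cvar:.2f}")
```

In this toy data set the expected value is 99, so the two worst scenarios (values 96 and 92) form the 20% tail, and the CVaR deviation is their average shortfall against the expected value.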



2.3 Basic Risk-Return Optimization Model 

We introduce an optimization algorithm that maximizes expected returns with 
respect to a CVaR risk constraint and that can be applied to portfolios with any loss 
distribution. We make use of this approach to solve the basic problem of the 
optimization model (P), which is to maximize the expected portfolio return subject to the 
constraints 1 and 3. 

Let x = (x_1, ..., x_n)' be the decision variable, i.e. the positions of the single assets. 
We define a linear objective function for the expected portfolio return, μ(x) = μ'x, 
with μ = (μ_1, ..., μ_n)' the vector of the expected returns of the single assets. The internal 
loss risk is measured by the CVaR deviation of the loss function L(x,y) and 
constrained by the maximum amount of economic capital, denoted as ec_cap_max. 
The area of the feasible solutions is defined by lower and upper position bounds, 
the vectors low_bound and up_bound respectively. We solve this generalized 
risk-return portfolio optimization problem (P) by an algorithm of 
Rockafellar/Uryasev [4,5]. Based on a scenario generation y_1, ..., y_K of the market prices of 
the portfolio assets, the CVaR constraint is approximated by a set of linear 
constraints, leading to a linear optimization problem. We obtain the following basic 
optimization model (P_CVaR) that maximizes the expected portfolio returns with 
respect to the constraints 1 and 3 of the optimization problem (P): 

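(P_CVaR)                                                             (3) 

Objective function: 

\[ \max_{x,\,q,\,z} \; \mu(x) = \mu' x = \sum_{j=1}^{n} \mu_j x_j \] 

Constraint 1: Internal risk constraint (internal loss risk, i.e. the CVaR 
deviation estimate, ≤ economic capital) 

\[ \text{(i)} \quad q + \frac{1}{(1-\alpha)K} \sum_{k=1}^{K} z_k \le \mathrm{ec\_cap\_max}, \] 
\[ \text{(ii)} \quad L(x, y_k) - q \le z_k, \quad k = 1, \dots, K, \] 
\[ \text{(iii)} \quad -z_k \le 0, \quad k = 1, \dots, K, \] 
\[ \text{(iv)} \quad q \in \mathbb{R}. \] 

Constraint 3: Boundaries of the feasible solutions 

\[ \text{(vi)} \quad \mathrm{low\_bound} \le x \le \mathrm{up\_bound}. \] 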
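The role of the auxiliary variables q and z_k can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes equally likely scenarios, the coefficient 1/((1-α)K) of the Rockafellar/Uryasev formulation, a fixed portfolio x (so that the scenario losses are given numbers), and a hypothetical capital limit. For fixed losses, the smallest value of the left-hand side of (i) that is compatible with (ii) and (iii) is the CVaR estimate, so the linear constraints hold for some q and z exactly when that estimate does not exceed ec_cap_max.

```python
def ru_left_hand_side(q, losses, alpha):
    """Left-hand side of (i) with the smallest z_k allowed by (ii) and (iii)."""
    k = len(losses)
    # z_k = max(L_k - q, 0) is the smallest z_k with z_k >= L_k - q and z_k >= 0.
    excess = sum(max(loss - q, 0.0) for loss in losses)
    return q + excess / ((1.0 - alpha) * k)


def cvar_estimate(losses, alpha):
    """Minimize the left-hand side of (i) over q.

    The expression is piecewise linear and convex in q with kinks at the
    scenario losses, so a minimizer can be found among those losses.
    """
    return min(ru_left_hand_side(q, losses, alpha) for q in losses)


# Hypothetical scenario losses L(x, y_k) of one fixed portfolio x.
losses = [-4.0, -3.0, -2.0, -1.0, -1.0, 0.0, 0.0, 1.0, 3.0, 7.0]
EC_CAP_MAX = 6.0  # hypothetical economic capital limit

estimate = cvar_estimate(losses, alpha=0.8)
feasible = estimate <= EC_CAP_MAX
print(f"CVaR estimate = {estimate:.2f}, constraint 1 satisfied: {feasible}")
```

In the full model x is a decision variable as well, and the same constraints are handed to a linear programming solver together with the objective and the position bounds.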



2.4 Extension to an Optimization Model for the Bank Portfolio 

We extend the basic optimization model (P_CVaR) to an optimization model (P) 
for the bank planning processes. Besides the internal loss risk boundaries, the bank 
has to comply with regulatory rules of risk limitation passed by the Basle 
Committee on Banking Supervision [1,2,7].^2 We outline how these constraints are 
integrated into the optimization model [6]. 

The bank book positions^3 are restricted with respect to their credit risk by a 
linear constraint. The sum of the risk weighted assets is limited by the regulatory 
capital resources of the core (‘tier 1’) capital plus the supplementary (‘tier 2’) 
capital. Capital charges of the trading book are based on the general market risk 
and the specific risk, as well as the counterparty risk of the trading book positions. 
We model linear constraints for the specific and the counterparty risk. With 
respect to the general market risk of the trading book we assume that the bank 
applies an internal market risk model to estimate the Value at Risk with the ‘Basle 
parameters’ [2,7]. As CVaR represents an upper bound of the corresponding VaR 
of the loss distribution, we apply a CVaR constraint on the general market risk of 
the trading book to achieve an upper bound of the regulatory VaR. We limit the 
sum of the general market risk (measured by the right hand side of the trading 
book CVaR constraint), the specific risk, and the counterparty risk by the applicable 
regulatory capital components, which consist of the tier 1 and tier 2 capital elements 
that are not used to cover bank book risks and the tier 3 capital^4 available. Further, 
we model constraints that limit the regulatory capital components [1,2,6]. 



3 Application Example 

We illustrate the effects of the risk-return optimization by a simplified 
application example [6]. The portfolio of an XY-Bank consists of four typical bank 
assets: asset 1 represents high quality bank bonds (rating AA), asset 2 corporate 
bonds (rating A), asset 3 industrial loans (rating B) and asset 4 a trading portfolio 
that is dependent on an equity index. In the current situation, the regulatory 
capital of the bank is used at 93.80% and cannot be increased in the next business 
year. The initial portfolio uses 76.20 units of economic capital. 

^2 We consider the prevailing regulatory risk limitation rules. The model allows a transition 
to the new ‘Basle II’ rules [3], which will require different input data, but will not influence 
the basic structure of the risk constraints. 

^3 In brief, the bank book comprises all ‘non trading’ assets, while the trading book 
comprises all positions the bank is holding for trading purposes (for a precise definition see [2]). 

^4 The tier 3 capital mainly comprises subordinate short term debt [2,7]. 

In order to gain additional profits, the managing board considers raising the 
economic capital, i.e. the level of internal risk, by additional undisclosed reserves 
of 17.10 units. The managing board wants to know whether the additional economic 
capital will lead to higher returns in the next business year and will be suitable to 
improve the portfolio risk-return relations and to meet the internal hurdle rate of 
14.00% for the overall return on risk adjusted capital (RORAC), which is defined 
as the expected portfolio return divided by the portfolio CVaR deviation. 

We apply the optimization model (P) with increasing CVaR levels to generate 
the efficient line shown in Fig. 2 below. The initial portfolio is denoted 
by PF 0, the optimal portfolio at the given level of risk by PF 1 and the optimal 
portfolio with the increase of the economic capital by PF 2. 





[Figure 2: plot of the efficient line; horizontal axis: Internal Risk Level (CVaR); 
the portfolios PF 0 and PF 1 are marked.] 

Fig. 2. Efficient Line of the XY-Bank Portfolio Resulting from Applications of (P) 



Analyzing the efficient line, we observe that to the left of the CVaR level P1 
only the internal risk constraint is active and the regulatory capital is not 
completely used. To the right of the CVaR level P2 only the regulatory constraint is 
active, as in this interval we achieve a stationary solution due to the regulatory 
capital constraint and the economic capital is not fully used. In the interval 
[P1,P2] both capital constraints are active, i.e. both capital resources, the 
economic and the regulatory capital, are fully utilized. 

A comparison of the optimal portfolios PF 1 and PF 2 shows that the increase of
the economic capital leads to higher expected returns. However, the RORAC of
PF 2 of 13.57% does not meet the hurdle rate of 14.00% and is even lower than
the RORAC of the initial portfolio PF 0 of 13.98%. Also, the economic capital
constraint is not active, and the economic capital cannot be fully used in the
portfolio PF 2.

We deduce that an increase of the economic capital is not advisable and that the
current level of risk should be maintained. The expected return of the portfolio PF 1
exceeds the expected return of the initial portfolio PF 0 by 0.31 units at the given
level of internal risk. Its portfolio RORAC of 14.39% complies with the internal
hurdle rate. Both capital resources, the economic and the regulatory capital, can
be fully used. Analyzing the risk-return relations along the efficient line, we
find that a maximal RORAC can be achieved in the interval [68.9,71.0]. However,
the implementation of a RORAC optimizing strategy would require a reduction of
absolute volumes and expected returns. This might conflict with other corporate
goals and may not be supported by the shareholders.



4 Conclusion 

We have introduced a risk-return optimization model for the bank portfolio
that can be applied in the planning processes of the bank in order to identify
risk-return efficient target portfolios. It maximizes the expected returns subject to
internal and regulatory risk constraints. It is based on the new risk measure
CVaR, which is appropriate for bank-wide portfolio risk measurement. The
optimization problem can be solved by linear programming techniques. The
optimization model generates consistent planning information. It identifies risk-return
efficient portfolios and finds intervals of maximal use of both capital resources, the
available economic and regulatory capital, and of highest portfolio RORACs, thus
providing basic information for a bank-wide risk-return management process and
contributing to an enhancement of the competitive position of the bank.



References 

[1] Basle Committee on Banking Supervision (1988): International convergence of capital
measurement and capital standards, Basle, July 1988.

[2] Basle Committee on Banking Supervision (1996): Amendment to the capital
accord to incorporate market risks, Basle, January 1996.

[3] Basle Committee on Banking Supervision (2001): Consultative Document: The New
Basel Capital Accord, Basel, January 2001.

[4] Rockafellar, R. T. and Uryasev, S. (2000): Optimization of Conditional Value-At-Risk, 
The Journal of Risk, Vol. 2, No. 4, pp. 21-51. 

[5] Rockafellar, R. T. and Uryasev, S. (2002): Conditional Value-at-Risk for General Loss 
Distributions, Journal of Banking and Finance, TUI. 

[6] Theiler, U. (2002): Optimization Approach for the Risk-Return-Management of the
Bank Portfolio, Wiesbaden (in German).

[7] United States Accounting Office (1998): Risk-Based Capital - Regulatory and Industry 
Approaches to Capital and Risk, Washington, July 1998. 





Resource-orientated Purchase Planning and 
Supplier Selection - 

Models and Algorithms for Supply Chain 
Optimization and E-Commerce 



Gabriele Reith-Ahlemeier 

POM Prof. Tempelmeier GmbH, Am Kapellenbusch 13, 50374 Erftstadt,
Germany, http://www.pom-consult.de, gabriele.reith-ahlemeier@pom-consult.de



Abstract. With the internet being available everywhere, companies are able to
automate parts of their global business processes. In this paper we consider the
problem of simultaneous supplier selection and purchase order sizing. Significant
cost savings are possible by integrating these problems in a software-based
Business-to-Business environment.

Today a purchasing agent often has to act without any algorithmic assistance. 
He has to order several items under dynamic demand conditions and with restricted 
material handling and storage capacities. Moreover, suppliers often offer quantity 
discounts which can vary over time. They furthermore may only be able to deliver 
on some special days and may insist on a minimum order volume. 

We develop and evaluate three new heuristic procedures - based on a new model 
formulation - to support the described decision process. 



1 Decision Problem 

In a typical industrial purchasing environment the purchasing agent has the
task of supplier selection and order sizing - given several items with
deterministic, dynamic demand for a finite planning horizon. Regarding the
potential suppliers, there might be a preselection process based on qualitative
criteria. The remaining suppliers differ in their fixed ordering costs
and in their all-units or incremental quantity discounts. The discount structures may
vary over time, e.g. due to marketing actions. Furthermore, the purchasing
agent has to pay attention to product- and supplier-specific delivery times
and limited capacities, such as material handling and storage capacities or
limited transportation volumes. Finally, some suppliers insist on minimum
order volumes and a given delivery schedule.

The purchasing agent's aim is to procure all items in time and for a
reasonable price. Besides the product prices, the relevant cost components are
price-dependent holding costs and product-specific fixed purchasing costs as
well as fixed purchasing costs that arise independently of the number of
items ordered from a certain supplier in a given period.







Very little literature addresses the described operational purchasing
problems including supplier selection. Benton and Rubin [1], [5] present
heuristics to solve such problems, but they consider stationary demand and only
all-units discount structures. Generally, the literature on order sizing under
consideration of quantity discounts comprises only a small number of
publications. Recent overviews are provided by Benton and Park [2] and by Munson
and Rosenblatt [3]. To our knowledge, there is currently no solution
approach available that fully solves the decision problem described above.

2 Models 

The problem can be modeled as a single level, capacitated dynamic lot sizing
problem. Unfortunately, with this formulation it is very complicated to model
price-dependent holding costs, as the holding costs are evaluated depending
on the stock and as, on account of quantity discounts, we cannot value
the complete stock at the same price. Therefore, we base our model
formulation on the well-known analogy between the plant location problem and the
dynamic lot sizing problem. We interpret the demand of an item in a given
period as a customer location and every possibility to order that demand from
a supplier in a certain period and on a certain discount level as a potential
plant location. The transportation distance is then defined by the number
of periods between demand and delivery period - that is exactly the interval
an item has to be kept in stock. In this way we are able to trace back the
prices that were paid.
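The plant location analogy can be illustrated with a small sketch of the cost of one "transport" from an offer (plant) to a demand (customer). This is only an illustration under simplifying assumptions: the class `Offer`, the function `assignment_cost` and the parameter `holding_rate` (holding cost per period as a fraction of the price paid) are hypothetical names, and delivery times, fixed costs and capacities are ignored. The exact model is given in [4].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """A potential 'plant location': one supplier, order period and discount level."""
    supplier: str
    order_period: int
    discount_level: int
    unit_price: float


def assignment_cost(offer, demand_period, quantity, holding_rate):
    """Cost of covering `quantity` units of demand in `demand_period` from `offer`.

    The 'transportation distance' is the number of periods the units stay in
    stock; the holding cost is charged on the price that was actually paid.
    """
    periods_in_stock = demand_period - offer.order_period
    if periods_in_stock < 0:
        raise ValueError("backorders are not allowed")
    purchase = offer.unit_price * quantity
    holding = holding_rate * offer.unit_price * periods_in_stock * quantity
    return purchase + holding


# Hypothetical data: 100 units needed in period 5, bought in period 2 at price 10.
offer = Offer(supplier="A", order_period=2, discount_level=1, unit_price=10.0)
print(assignment_cost(offer, demand_period=5, quantity=100, holding_rate=0.01))
```

Because every assignment carries its own price, two units that are in stock at the same time can be valued at different prices, which is what the lot sizing formulation cannot express easily.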

The model is structured as follows: The objective function minimizes the
relevant cost components described above. Several restrictions are required
to enforce the given capacity constraints. Moreover, there are some special
features in contrast to standard plant location problems. First, in every period no
more than one discount class of a certain supplier can be used. Therefore,
it must be ensured that the other classes of that supplier (interpreted as
alternative plant locations) are locked. Secondly, backorders are not allowed,
which means that demand must be satisfied in time. So not every supplier's
offer (once again interpreted as plant location) can be used to satisfy the
demand (as customer location), but only offers from earlier periods. See
Reith-Ahlemeier [4] for the exact model formulation.

Note that the model captures planning situations where some suppliers
offer all-units discounts and others use an incremental discount scheme for
the same item. It is even possible that a supplier switches from all-units to
incremental discounts within the planning horizon of the model.
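The difference between the two discount schemes can be made concrete with a short sketch. The function names and the price list are hypothetical. `breakpoints[i]` is assumed to be the lower quantity limit of discount class i, with `breakpoints[0] == 0`.

```python
def all_units_cost(quantity, breakpoints, prices):
    """All-units discount: the price of the class reached applies to every unit."""
    price = prices[0]
    for lower, p in zip(breakpoints, prices):
        if quantity >= lower:
            price = p
    return price * quantity


def incremental_cost(quantity, breakpoints, prices):
    """Incremental discount: each price applies only to the units inside its class."""
    total = 0.0
    for i, (lower, p) in enumerate(zip(breakpoints, prices)):
        upper = breakpoints[i + 1] if i + 1 < len(breakpoints) else float("inf")
        units_in_class = max(0, min(quantity, upper) - lower)
        total += p * units_in_class
    return total


# Hypothetical price list: 10 per unit, 9 from 100 units on, 8 from 500 units on.
breakpoints = [0, 100, 500]
prices = [10.0, 9.0, 8.0]
print(all_units_cost(600, breakpoints, prices))    # 600 * 8
print(incremental_cost(600, breakpoints, prices))  # 100 * 10 + 400 * 9 + 100 * 8
```

Under the incremental scheme the units of one order carry different prices, which is one reason why the price paid has to be traceable for the holding cost calculation.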

3 Heuristic Procedures 

In view of its complexity (it belongs to the class of NP-complete problems),
the developed model can be solved exactly only for rather small problem
instances. Therefore, to solve the model in a routine planning environment,
a heuristic solution procedure is required. Such a solution procedure must be
fast and - with regard to varying constraints in industrial practice - it must
be easily extendible.

We develop a simple local search heuristic as well as two heuristics based
on Lagrangean relaxation - one of these additionally using the concept of
branch & bound. Reith-Ahlemeier [4] gives a detailed description of all
three heuristics.

Local Search Heuristic. The local search heuristic consists of two
phases. The first phase starts by neglecting all capacity constraints. The
remaining single-item order quantity problems are solved by the heuristic
ISSOS-procedure proposed by Tempelmeier [6]. An initial feasible solution
is created by considering each type of capacity constraint in sequence. In
case of a violation, several options to attain feasibility by splitting, combining
or shifting orders are considered, and the option that causes the lowest
total costs is implemented. The second phase separates the total costs into fixed ordering,
variable ordering and holding costs and tries to reduce them by focusing on
each component in turn - only considering options that ensure feasibility of
the order schedule with respect to all capacity constraints. Fig. 1 summarizes
the overall structure of the heuristic.



Fig. 1. Structure of the local search heuristic 



Phase I: Initial solution

    For each item, solve a single-item order size problem, neglecting the capacity
    constraints.

    If the order schedule violates any capacity constraints, construct a feasible
    solution by shifting, combining and splitting orders.

Phase II: Improvement steps

    Step 1: Take measures to reduce the fixed ordering costs.

    Step 2: Take measures to reduce the variable ordering costs.

    Step 3: Take measures to reduce the holding costs.



Lagrange Heuristic. The Lagrange heuristic is based on the relaxation
of all capacity constraints and of the constraint that ensures demand fulfillment.
The Lagrangean multipliers are adjusted via standard subgradient optimization
techniques. To provide a lower bound in each iteration, the arising relaxed
problem is decomposed into several continuous and binary knapsack problems.
For that, surrogate constraints - enabling stronger bounds without
restricting the solution space - are set up to ensure sufficient order volume.
Feasible solutions are developed based on the current Lagrangean multipliers,
using the local search heuristic described above. An overview of the steps
performed to find the heuristic solution is given in Fig. 2.
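Continuous knapsack problems are attractive as subproblems because they can be solved by a simple greedy rule. The following sketch shows the generic textbook version in maximization form with strictly positive weights. It is not the subproblem of the Lagrange heuristic itself, whose objective and surrogate constraints are defined in [4], and the function name and data are hypothetical.

```python
def continuous_knapsack(values, weights, capacity):
    """Maximize sum(values[i] * x[i]) subject to
    sum(weights[i] * x[i]) <= capacity and 0 <= x[i] <= 1.

    Greedy rule: fill items in order of decreasing value per unit of weight;
    at most one item ends up with a fractional x[i]. Assumes weights > 0.
    """
    x = [0.0] * len(values)
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    remaining = capacity
    for i in order:
        if values[i] <= 0 or remaining <= 0:
            break  # no further item can improve the objective
        take = min(1.0, remaining / weights[i])
        x[i] = take
        remaining -= take * weights[i]
    objective = sum(v * xi for v, xi in zip(values, x))
    return x, objective


# Hypothetical data: the third item only fits partially.
x, objective = continuous_knapsack([60.0, 100.0, 120.0], [10.0, 20.0, 30.0], 50.0)
print(x, objective)
```

The binary variant, in which every x[i] must be 0 or 1, no longer has such a greedy solution and is usually handled by dynamic programming or branch & bound.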

Fig. 2. Structure of the Lagrange heuristic 



Lagrangean relaxation of all complicating restrictions.

Data-based problem reduction.

Subgradient optimization.

While the current iteration number does not exceed a given maximum number
of iterations:

    Solve the Lagrangean subproblem to attain a lower bound (LB):
    Decompose the problem into binary and continuous knapsack problems with
    the help of surrogate constraints.

    Is the solution of the current subproblem a new LB?
    Has the LB not changed for a given number of iterations?

        Find a feasible solution as upper bound (UB) based on the current
        Lagrangean multipliers.

        LB = (1 - range of tolerance) · UB?

            Stop.

    Adjust the Lagrangean multipliers.

Select the current UB as a solution of the order quantity problem.
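The multiplier adjustment and the stopping test of such a scheme can be sketched as follows. This is the common textbook step size rule, not necessarily the rule used in [4]. It assumes that all relaxed constraints are written as inequalities, so that the multipliers stay nonnegative, and it reads the stopping test as "LB has reached (1 - tolerance) · UB". The function names and the parameter `theta` are hypothetical.

```python
def subgradient_step(multipliers, subgradient, lower_bound, upper_bound, theta=1.0):
    """One multiplier update: move along the subgradient (the violations of the
    relaxed constraints) with a step scaled by the current gap UB - LB, then
    project back onto the nonnegative orthant."""
    norm_sq = sum(g * g for g in subgradient)
    if norm_sq == 0:
        return list(multipliers)  # all relaxed constraints are tight
    step = theta * (upper_bound - lower_bound) / norm_sq
    return [max(0.0, lam + step * g) for lam, g in zip(multipliers, subgradient)]


def gap_closed(lower_bound, upper_bound, tolerance):
    """Stopping test: the lower bound is within the tolerance of the upper bound."""
    return lower_bound >= (1.0 - tolerance) * upper_bound


# Hypothetical iteration: the first relaxed constraint is violated by 1,
# the second has a slack of 2.
print(subgradient_step([0.0, 1.0], [1.0, -2.0], lower_bound=90.0, upper_bound=100.0))
print(gap_closed(99.5, 100.0, tolerance=0.01))
```

In practice `theta` is usually reduced whenever the lower bound has not improved for a number of iterations, which corresponds to the second question in the loop above.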



Branch & Bound Heuristic. To develop a branch & bound heuristic the
model is relaxed in the same way as for the Lagrange heuristic. Once again the
subproblems are decomposed into continuous and binary knapsack problems.
In contrast to the Lagrange heuristic, the surrogate constraints now limit the
number of alternative offers (interpreted as locations) that can be used. The
resulting bounds are not as strong as before, but as the subproblems can be
solved much faster, more iterations are possible. In each iteration, the problem
can be reduced by further reducing the number of alternative offers.
Furthermore, a certain number of variables can be fixed heuristically, depending on
the modification of the total costs. Fig. 3 shows the special elements of the
branch & bound heuristic.


















Fig. 3. Special structure of the branch & bound heuristic 



Branch & bound. While variable fixation is possible:

    Choose a branch variable.

    Subgradient optimization.

    While the current iteration number does not exceed a given maximum
    number of iterations:

        Solve the Lagrangean subproblem to attain a lower bound (LB):
        Decompose the problem into continuous and simplified binary knapsack
        problems with the help of surrogate constraints.

        LB > UB?

            Bounding.

        Problem reduction.

    Heuristic fixation of variables.

Select the current UB as a solution of the order quantity problem.



4 Numerical Results 



To compute an exact solution for at least small problems we used CPLEX 6.6
for parallel computers running on a UNIX-based host using eight parallel
processors. Depending on the problem data, solution times required for finding
the exact solutions for problem instances with 3 products, 15 periods, 5
discount levels and 3 suppliers ranged between a few seconds and several hours.
The heuristics were implemented in Visual Basic 6.0 on a Pentium-based
Windows 2000 personal computer with 400 MHz clock speed. The computation of
heuristic solutions only took a few seconds. Not surprisingly, the local search
heuristic was the fastest one. The computation times of the other
heuristics were also very small and did not vary as much as those of the exact solution
procedure (see the coefficient of variation (CV) in Table 1). Although the
branch & bound heuristic was on average faster than the Lagrange heuristic,
its coefficient of variation is much bigger.

Varying the demand patterns, the order cycles and the capacity
restrictions as well as the scheme for delivery periods and for the time-varying price
structures we created 1440 instances. Table 1 summarizes the average results.

















Table 1. Average solution quality and computation time 



Heuristic       Deviation from optimum (%)   Computation time (sec)   CV

Local search    2.47                         <1                       -
Lagrange        2.05                         30.58                    0.20
B&B             1.98                         16.22                    0.90



Even the local search heuristic was able to compute solutions with an
average deviation of only 2.5 percent from the exact ones. The other heuristics
reduced this value to about two percent.

Unfortunately, it is not possible to test the solution quality for larger
problems. Only the time to compute a heuristic solution can be measured. Extending
the planning horizon to thirty periods and the number of products to ten
led to an average computation time of 8.33 seconds for the local search heuristic. The
Lagrange heuristic required an average computation time of 213 seconds,
whereas the branch & bound heuristic required 1613.31 seconds. This
confirms the observation for the smaller problems that the latter heuristic is
much more sensitive to problem data.

Summarizing the results, the developed heuristics are suitable for
application in an interactive environment. In particular, the local search heuristic takes
very little time to generate a solution of high quality. The advantages of
the other heuristics depend on the problem data; neither of them outperforms the
other.



References 

1. Benton, W. C. (1991) Quantity discount decisions under conditions of multi- 
ple items, multiple suppliers and resource limitations. International Journal of 
Production Research 29(10): 1953-1961 

2. Benton, W. C., Park, S. (1996) A classification of literature on determining the 
lot size under quantity discounts. European Journal of Operational Research 92: 
219-238 

3. Munson, C. L., Rosenblatt, M. J. (1998) Theories and realities of quantity dis- 
counts: an exploratory study. Production and Operations Management 7(4): 
352-369 

4. Reith-Ahlemeier, G. (2002) Ressourcenorientierte Bestellmengenplanung und
Lieferantenauswahl - Modelle und Algorithmen für Supply Chain Optimierung
und E-Commerce. Books on Demand, Norderstedt

5. Rubin, P. A., Benton, W. C. (1993) Jointly constrained order quantities with 
all-units discounts. Naval Research Logistics 40: 255-278 

6. Tempelmeier, H. (2003) A simple heuristic for dynamic order sizing and supplier
selection with time-varying data. Production and Operations Management, to
appear





A Combinatorial Approach to 
Orthogonal Placement Problems 



Gunnar W. Klau 

Konrad-Zuse-Zentrum für Informationstechnik Berlin (ZIB), klau@zib.de 



Abstract. This article presents the main results of a PhD thesis that deals with 
two families of NP-hard orthogonal placement problems. We develop a common 
combinatorial framework for compaction problems in graph drawing and for label- 
ing problems in computational cartography. Compaction problems are concerned 
with performing the conversion from a dimensionless description of the orthogonal 
shape of a graph to an area-efficient drawing in the grid. Map labeling is the task of 
attaching labels to point-features so that the resulting placement is legible. On the 
basis of new combinatorial formulations for these problems we develop exact algo- 
rithms. Extensive computational studies on real-world benchmarks show that our 
linear programming-based algorithms solve large instances of the placement prob- 
lems to provable optimality within short computation time. Often, our algorithms 
are the first exact algorithms for the respective problem variant. 



We analyze two families of orthogonal placement problems that arise in 
the area of information visualization. The first family, compaction of orthog- 
onal grid drawings, is concerned with performing the conversion from a di- 
mensionless description of the orthogonal shape of a graph to an area-efficient 
drawing in the orthogonal grid with short edges. This two-dimensional com- 
paction problem emerges in the last phase of a powerful method for high- 
quality orthogonal graph drawing, the topology-shape-metrics scheme. The 
second family of problems plays an important role in the area of computational
cartography and deals with the task of attaching rectangular labels to
point-features such as cities or mountain peaks on a map.

It is common to both drawings of graphs and cartographic maps that 
they convey complex information about relations of objects as a geometric 
representation. Moreover, the utility of this representation depends on the 
quality of the layout process. The overall aim is to generate a drawing or 
a map of maximum readability that is intuitive to understand and use and 
which effectively communicates the underlying information. As an example, 
Fig. 1 shows “good” and “poor” solutions of orthogonal placement problems. 

Why are the orthogonal placements in Fig. 1(a) and (c) better than those
of Fig. 1(b) and (d)? The edges of the right orthogonal drawing are
unnecessarily long, confuse the reader, and increase the amount of drawing space
needed. The compact drawing on the left has been enlarged by 25% and is
still more area-efficient. Due to the better resolution and shorter edges it is
superior to the right drawing. Even more obvious reasons make the left
labeling better than the right one: In Fig. 1(d), information is lost since not
all labels are placed. Furthermore, many of the labels overlap, which makes
it difficult to extract the necessary information.

Fig. 1. Good and poor solutions of orthogonal placement problems

A further characteristic common to both areas is that most of the
respective problems are computationally hard to solve. Everybody who has tried to
draw a graph with 20 vertices by hand knows about the difficulties of
finding an aesthetically pleasing drawing. Even if this task can be accomplished,
it remains a tedious, complicated, and time-consuming process. The same
applies to the placement of labels on a map.

Due to space limitations, this article can only give an introduction to the 
techniques developed in the thesis. We refer to [9] for details. 



Compaction in Orthogonal Graph Drawing 

The area of automatic graph drawing is devoted to the development of algo- 
rithms that produce geometrical representations of graphs and to the prob- 
lems that arise within this context, see, e.g., [5] and [8]. Here, we focus on 
orthogonal drawings, i.e., drawings of graphs in which the edges are repre- 
sented by sequences of alternating vertical and horizontal line segments. For 
numerous applications, orthogonality is a convention (e.g., UML-diagrams,
circuit layouts, entity-relationship diagrams), and for many other applications
orthogonal drawing algorithms produce the best layouts. 

Due to the often conflicting aesthetic criteria, tradeoffs cannot be
avoided. For orthogonal drawings, the following ranking is widely accepted:
The primary goal is to minimize the number of edge crossings; ideally a graph
should be drawn without any crossings at all. Secondly, the number of bends
should be as low as possible. Finally, besides few crossings and bends, the
edges should be short in the drawing, which also leads to good area bounds.

According to the above ranking of criteria, the topology-shape-metrics
scheme, first mentioned in [2], leads to the best results. It divides the
drawing task into three phases. The first phase (planarization) aims at minimizing
the number of crossings. Initially, it identifies a small set of edges whose re- 
moval results in a planar subgraph and then computes a planar embedding 
for this subgraph. Then, the temporarily deleted edges are reinserted at the 
combinatorial level so that the number of crossings is low. In order to main- 
tain a planar graph, every crossing is replaced by an artificial vertex. The 
orthogonalization phase deals with determining the orthogonal shape of the 
resulting drawing. Here, the optimization goal is to minimize the number of 
bends that occur along the edges of the drawing. While it is NP-hard to 
minimize this number over all embeddings of a planar graph, the problem 
can be elegantly solved for a fixed embedding by solving a minimum-cost 
flow problem [11]. The output of the orthogonalization phase is a so-called 
orthogonal representation that contains the necessary information about the 
topology and the orthogonal shape of the drawing. Yet, the description is 
dimensionless and coordinates still have to be assigned to the vertices and 
bends of the drawing. 

The compaction phase must transform an orthogonal representation into a
drawing with small total edge length or little area. Again, this is an NP-hard
problem [10]. Previous algorithmic research for this problem can be divided
into constructive and improvement heuristics: Basic constructive heuristics
perform a rectangular dissection of the given orthogonal representation [11,6].
In [3] these techniques are extended by introducing the concept of
turn-regularity. However, the results still leave room for considerable
improvement. Improvement heuristics such as the compression-ridge method [1,4]
and graph-based compaction techniques, e.g., [7], originate in the area of
VLSI-design and consider the one-dimensional subproblems of reducing the
horizontal or vertical edge lengths. In many cases, iterative usage of these
heuristics with alternating direction yields considerable improvement.

The key idea of our approach to the two-dimensional compaction problem 
is to translate it into an equivalent combinatorial problem involving a pair 
of constraint graphs. Investigating combinatorial properties of these graphs 
leads to new algorithms that can solve large instances of the compaction 
problems to optimality in short computation time. Based on the observation 
that we can treat the horizontal and vertical direction to a great extent 
separately, each directed graph corresponds to one such direction. 

We investigate how the constraint graphs must interact in order to
develop a combinatorial characterization of the compaction problem. Thereby,
we exploit the fact that, due to the given shape, many relative positions of
vertices, edges, and bends are already determined. We introduce the shape
graphs that reflect the orthogonal shape of the input. Moreover, we identify
a central path- and cycle-based property of constraint graphs, completeness,
that forms the link between the otherwise unconnected horizontal and
vertical graphs. In essence, a complete pair of constraint graphs consists of
two acyclic graphs in which each pair of nodes is separated by one of four
paths. Each of these paths corresponds to one of the four possible relative
placements of a pair of objects in two dimensions. We show that, for
complete placement graphs, the compaction problem reduces to two separate
one-dimensional problems for which optimal solutions lead to an optimal
solution of the problem in two dimensions.
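The role of an acyclic constraint graph in one dimension can be illustrated with a classical technique from VLSI compaction: every arc (u, v, d) demands coord[v] >= coord[u] + d, and longest-path lengths in topological order give the smallest feasible coordinate of every node. The sketch below only minimizes the extent in one direction. It does not minimize the total edge length, which is the objective treated in [9], and the function name and the example graph are hypothetical.

```python
from collections import deque


def leftmost_coordinates(num_nodes, arcs):
    """Smallest coordinates satisfying coord[v] >= coord[u] + d for all arcs
    (u, v, d) of an acyclic constraint graph, computed as longest paths in
    topological order (Kahn's algorithm)."""
    successors = [[] for _ in range(num_nodes)]
    indegree = [0] * num_nodes
    for u, v, d in arcs:
        successors[u].append((v, d))
        indegree[v] += 1
    coord = [0] * num_nodes
    queue = deque(i for i in range(num_nodes) if indegree[i] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v, d in successors[u]:
            coord[v] = max(coord[v], coord[u] + d)
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if visited != num_nodes:
        raise ValueError("constraint graph contains a directed cycle")
    return coord


# Hypothetical horizontal constraint graph with four vertical segments.
arcs = [(0, 1, 1), (0, 2, 1), (1, 3, 2), (2, 3, 1)]
print(leftmost_coordinates(4, arcs))
```

Running the same routine on the vertical constraint graph yields the second coordinate, which mirrors the separation into two one-dimensional problems described above.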

The shape graphs are uniquely determined by the given orthogonal repre- 
sentation. In case of complete shape graphs, we can solve the two-dimensional 
compaction problem in polynomial time. Furthermore, we investigate the one- 
dimensional compaction scheme and demonstrate that instances exist for 
which a linear number of alternating compaction steps is necessary. More- 
over, we show that algorithms within the scheme do not approximate the 
compaction problem by a constant factor. 

In general, the shape graphs are not complete and we identify a set of 
potential additional arcs. We show that the set of complete extensions that 
results from adding certain subsets of potential arcs to the shape graphs is in 
one-to-one correspondence to the feasible solutions of the original problem. 
It is the choice of potential arcs that makes the compaction problem difficult 
at the combinatorial level. However, we can characterize those shape graphs 
that admit a unique extension and solve the compaction problem in poly- 
nomial time for these instances. Otherwise, we translate the combinatorial 
problem into an integer linear program whose feasible solutions correspond 
to feasible solutions of the original compaction problem and vice versa. This 
enables us to optimize over the set of feasible orthogonal drawings for a given
instance, and we present both a branch-and-bound and a branch-and-cut
algorithm to solve the two-dimensional compaction problem to optimality. We
test our implementations on a large set of widely used benchmark-graphs 
from different test-suites, including a set of 11,582 graphs arising from real- 
world applications. Our extensive computational study shows that we can 
solve all real-world problem instances in short computation time. 

Map Labeling 

Map labeling problems attract many researchers in computer science. On
the one hand, this is due to their numerous applications, e.g., in cartography,
geographic information systems, point pattern analysis, spatial statistics, and
graphical interfaces. On the other hand, many combinatorial optimization
problems with beautiful mathematical properties appear in this area. For an
overview on this subject see the bibliography [12]. In this article, we focus on
point-feature label placement in which the task is to place labels adjacent to
point features such as cities, mountain peaks, or points in a statistical plot
so that no labels overlap. We concentrate on the six different labeling models
in Fig. 2. The discrete or fixed-position models (Fig. 2(a)-(c)) allow only a
finite number of positions per label. More natural are slider models in which
a label may move continuously around its point-feature, see Fig. 2(d)-(f).

[Figure panels: (a) Four-position, (b) Two-position, (c) One-position, (d) Four-slider, (e) Two-slider, (f) One-slider]

Fig. 2. Axis-parallel rectangular labeling models. A label can be placed in any of
the positions indicated by the rectangles and can slide in the directions of the arcs

In general it is not possible to place all the given labels in their original 
size without any overlap. We focus on the label number maximization problem 
where the task is to place the maximum number of non-overlapping labels 
without changing their sizes. Again, we associate a pair of constraint graphs 
with problem instances; the key idea is the same as for the two-dimensional 
compaction problems: If these graphs satisfy certain path- and cycle-based 
properties, we can produce a solution for the original problem by separately 
assigning values to the nodes of the constraint graphs. These values
correspond to the x- and y-coordinates of the labels.

For a given instance of a labeling problem we construct a special pair of
labeling graphs. We introduce different kinds of arcs whose presence ensures
necessary properties of feasible label placements: The fixed distance arcs
ensure that the relative position between the point-features remains fixed. Label
size arcs guarantee that every label is represented by a rectangle of width and
height as described in the input. By introducing the proximity arcs we
determine the rectangular region around the appropriate point-feature in which a
label can be placed. In order to prevent a label from covering the point-feature
it belongs to, we define the boundary arcs that are inverse to the
proximity arcs. Unlike the previously introduced types of arcs, the boundary arcs
belong to the class of potential arcs and influence the labeling model. Each
discrete or slider model corresponds to requirements on subsets of boundary
arcs that have to be present in the labeling graphs. We define a second type
of potential arcs in order to control the overlaps between labels. The label
separation arcs make sure that pairs of labels do not overlap in a placement.

We can now restate the pure labeling problem in which all labels have to 
be placed without scaling and overlaps as the identification of a subset of po- 
tential arcs that satisfies the following two properties. First, the set of chosen 
boundary arcs must comply with the appropriate labeling model. The second 
property extends the notion of completeness as defined for the compaction
problems: At least one label separation arc has to be chosen for each label
pair, and adding the chosen potential arcs to the labeling graphs must not in- 
duce directed cycles of positive weight. We show that the combinatorial refor- 
mulation is equivalent to the pure labeling problem by establishing a one-to- 
one correspondence between feasible solutions. Furthermore, we demonstrate 
how to adapt the new combinatorial problem to result in equivalent formu- 
lations of the label number maximization problem. We find it remarkable 
that our new approach is independent of the labeling model and results in 
discrete formulations even if the problems are of continuous nature as in the 
slider models. The combinatorial formulation for the pure labeling problem 
admits a straightforward characterization as a zero-one polytope through an 
incidence vector for the set of potential arcs. We provide an integer linear 
programming formulation for this polytope by describing feasible solutions 
of the combinatorial version of the pure labeling problem with classes of 
inequalities and integrality constraints. For one class of inequalities, the positive 
cycle inequalities, we investigate the corresponding separation problem 
and show that it is NP-complete. We present an extended formulation that 
evades the class of positive cycle inequalities. Our integer linear programs 
for the label number maximization problem are not as straightforward. We 
develop a first formulation with an additional binary variable vector that 
represents the decision to place or not to place a label. We integrate the new 
variables in the existing inequalities and show that feasible solutions of the 
resulting formulations correspond to an overlap-free labeling for a subset of 
labels. In a second formulation we manage to eliminate the newly introduced 
decision variables by a substitution step. However, we have to add additional 
inequalities to adjust the objective function. 
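
The cycle condition used above can be illustrated for a fixed set of chosen arcs. The following is a minimal sketch, assuming arcs are given as (tail, head, weight) triples; the function name and the toy data are hypothetical and not taken from the thesis. It uses a longest-path variant of the Bellman-Ford scheme to test whether a directed cycle of positive weight exists. This only checks one given arc set and is unrelated to the hardness of separating the positive cycle inequalities.

```python
def has_positive_cycle(num_nodes, arcs):
    """Return True if the directed graph has a cycle of positive total weight.

    arcs is a list of (tail, head, weight) triples over nodes 0..num_nodes-1.
    """
    # Longest-path relaxation started from all nodes at once (all labels 0).
    dist = [0.0] * num_nodes
    for _ in range(num_nodes):
        changed = False
        for tail, head, weight in arcs:
            if dist[tail] + weight > dist[head]:
                dist[head] = dist[tail] + weight
                changed = True
        if not changed:
            return False  # labels settled, so no positive cycle exists
    return True  # labels still grow after num_nodes passes: positive cycle


# Hypothetical toy data: the cycle 0 -> 1 -> 2 -> 0.
positive = [(0, 1, 2.0), (1, 2, -1.0), (2, 0, 0.0)]    # total weight +1
balanced = [(0, 1, 2.0), (1, 2, -1.0), (2, 0, -1.0)]   # total weight 0
```

A chosen arc set would be rejected as soon as `has_positive_cycle` returns True for the resulting labeling graph.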

We present branch-and-bound and branch-and-cut algorithms and an iter- 
ative branch-and-bound scheme for the zero-one and extended formulations. 
The algorithms work in all labeling models and are the first exact algorithms 
for the continuous slider models. We provide extensive computational ex- 
periments in which we test our new algorithm on a large set of benchmark 
data. The results show that the exact algorithms produce provably optimal 
solutions for large instances in reasonable computation time. 

We show how to combine our approaches to compaction and labeling 
problems in order to devise first algorithms for the interesting class of graph 
labeling problems. In particular, we consider a problem that occurs in the 
area of automation engineering: Simultaneous drawing and labeling of state 
diagrams. In conclusion, we believe that our combinatorial characterizations 
are extensible and can be applied to many related problems such as 
packing or location problems. 

References 



1. S. Akers, M. Geyer, and D. Roberts. IC mask layout with a single conductor 
layer. In Proc. 7th Des. Autom. Workshop, pages 7-16. ACM/IEEE, 1970. 

2. C. Batini, E. Nardelli, and R. Tamassia. A layout algorithm for data-flow 
diagrams. IEEE Transactions on Software Engineering, SE-12(4):538-546, 1986. 

3. S. Bridgeman, G. Di Battista, W. Didimo, G. Liotta, R. Tamassia, and 
L. Vismara. Turn-regularity and optimal area drawings of orthogonal 
representations. Computational Geometry: Theory and Applications, 16(1):53-93, 2000. 

4. W. W.-M. Dai and E. S. Kuh. Global spacing of building-block layout. In C. H. 
Sequin, editor, VLSI '87, pages 193-205. Elsevier Science, 1987. 

5. G. Di Battista, P. Eades, R. Tamassia, and I. G. Tollis. Graph Drawing. 
Algorithms for the Visualization of Graphs. Prentice Hall, 1999. 

6. F. Hoffmann and K. Kriegel. Embedding rectilinear graphs in linear time. 
Information Processing Letters, 29(2):75-79, 1988. 

7. M.-Y. Hsueh. Symbolic layout and compaction of integrated circuits. Technical 
Report UCB/ERL M79/80, Univ. of California, Berkeley, CA, U.S.A., 1979. 

8. M. Kaufmann and D. Wagner, editors. Drawing Graphs: Methods and Models, 
volume 2025 of Lecture Notes in Computer Science. Springer, 2001. 

9. G. W. Klau. A Combinatorial Approach to Orthogonal Placement Problems. 
PhD thesis, Univ. d. Saarlandes, Saarbrücken, Germany, September 2001. 

10. M. Patrignani. On the complexity of orthogonal compaction. Computational 
Geometry: Theory and Applications, 19(1):47-67, 2001. 

11. R. Tamassia. On embedding a graph in the grid with the minimum number of 
bends. SIAM J. Comput., 16(3):421-444, 1987. 

12. A. Wolff and T. Strijk. The map labeling bibliography. 
http://www.math-inf.uni-greifswald.de/map-labeling/bibliography. 





Assigning Frequencies in GSM Networks* 



Andreas Eisenblätter 

Konrad-Zuse-Zentrum für Informationstechnik Berlin (ZIB) 
Takustr. 7, D-14195 Berlin, Germany 



Abstract. Mobile communication is a key technology in today’s information age. 
Despite the ongoing improvements in equipment design, interference remains a lim- 
iting factor for the use of radio communication. The author investigates in his PhD 
thesis how to largely prevent interference in GSM networks by carefully assigning 
the available frequencies to the installed base stations. The topic is addressed from 
two directions: first, new algorithms are presented to compute “good” frequency 
assignments fast; second, a novel approach, based on semidefinite programming, is 
employed to provide lower bounds for the amount of unavoidable interference. 

The proposed new methods for automatic frequency planning are compared 
in terms of running times and effectiveness in computational experiments using 
instances from practice. For most of the heuristics the running time behavior is 
suited for interactive planning, and they provide good assignments from a practical 
point of view. Several of these methods are successfully employed by the German 
GSM operator E-Plus Mobilfunk GmbH & Co. KG. 

The best lower bounds on the amount of unavoidable (co-channel) interference 
are presently obtained from solving semidefinite programs. These programs arise as 
nonpolyhedral relaxation of a minimum k-partition problem on complete graphs. 
The success of this approach is underpinned by revealing structural relations 
between the solution set of the semidefinite program and a polytope associated with 
an integer linear programming formulation of the minimum k-partition problem. 
Comparable relations are not known to hold for any polynomial time solvable 
polyhedral relaxation of the minimum k-partition problem. The application described 
is among the first of semidefinite programming to large industrial problems in com- 
binatorial optimization. 



1 Introduction 

The Global System for Mobile Communications or, for short, GSM is nowadays 
the predominant technology for mobile communication. More than half a 
billion people in over 150 countries use GSM for mobile telephony and for ex- 
changing short text messages (SMS). Within a decade, GSM has grown from 
a costly service used by few professionals to a mass market with penetration 
rates higher than 70% in Finland and Iceland. In some countries, the mobile 
phone subscribers already outnumber the fixed-line telephone subscriptions. 

The mobile communication relies on a radio link between the user’s mobile 
phone and some stationary base station, which is part of a GSM operator’s 
infrastructure, see Fig. 1. Currently, a base station typically serves three 

* This presentation is based on the Ph.D. thesis of the author [5]. 








different areas (cells) with up to 6 transmitters. Each transmitter uses a fre- 
quency slot of 200 kHz, called channel, to handle at most 6-8 users in parallel 
via time multiplexing (time division multiple access, TDMA). Nearby 
transmitters have to use different channels (frequency division multiple access, 
FDMA). As with all forms of radio communication, the limited radio spec- 
trum is a bottleneck. National regulation authorities usually license between 
60-120 channels of radio bandwidth to GSM operators. 

An operator has to reuse its channels many times to operate the several 
tens of thousands of transmitters that are typically installed in a network. 
Each radio link, however, requires a signal of sufficient strength which, at the 
same time, is not suffering too severely from interference by other signals, 
see Fig. 2. Significant interference may be caused by transmitters using the 
same channel (co-channel) or an adjacent channel. 




Fig. 2. Field strength and interference: (a) inhomogeneous decay of a signal’s field 
strength (path loss) in an urban environment (by courtesy of E-Plus); (b) estima- 
tion of interference in terms of affected cell area 








The reuse of channels is therefore limited, and frequency planning turns 
into a key issue in fully exploiting the available radio spectrum. Notice that 
it is customary to use the word frequency as a synonym for channel in this 
context. By avoiding interference, frequency planning has a significant impact 
on the quantity as well as on the quality of the radio communication services. 

Frequency assignment is usually performed at the end of a chain of planning 
activities. The placement of the base stations as well as the selection 
and configuration of their antennas are the basis for delivering the desired 
network coverage. The subsequent decisions on how many transmitters to op- 
erate in each cell build the foundation for the desired network capacity. The 
final step of assigning the frequencies “merely” has to ensure that the cov- 
erage and capacity goals can be met: namely, by providing each transmitter 
with a frequency that is (locally) at most moderately interfered. 

The Ph.D. thesis [5] addresses several topics, ranging from the techni- 
cal background of the GSM frequency planning problem (Chap. 2) over 
alternative mathematical models (Chap. 3) and heuristic planning meth- 
ods (Chaps. 4, 5) to quality assessments for the generated frequency plans 
(Chaps. 6-8). An overview is given in the following. The theory (Chaps. 7, 
8) underlying the computation of unavoidable interference for the quality 
evaluation, however, is not addressed here. 

Much of this work is related to a cooperation between ZIB and the German 
GSM 1800 network operator E-Plus Mobilfunk GmbH & Co. KG. The 
focus of the cooperation was primarily on fast frequency planning heuristics 
for use in the regular radio planning process at E-Plus. New planning 
methods were developed at ZIB and integrated into E-Plus’ software 
environment. In 1997, the new software was first used successfully in practice. A 
series of extensions has meanwhile been implemented. 

2 Optimization Model for GSM Frequency Assignment 

The frequency planning problem sketched above can be formalized as a com- 
binatorial minimization problem. An undirected graph G = (V, E) is defined 
together with vertex- and edge-labelings. A vertex is introduced for each 
transmitter (demand for one frequency). An edge is introduced whenever 
there is an interdependency between the corresponding transmitters. 

The edge-labelings record three types of interdependencies. The separation 
label d(vw) is the minimum required difference of the channels assigned 
to v and w. Typical values are (0,) 1, 2, 3. The co- and adjacent channel 
interference labels c^{co}(vw) and c^{ad}(vw) record how much interference is 
incurred in case the same channel, respectively adjacent channels, are assigned 
to v and w. Interference is normalized to values between 0 and 1. 

Each vertex label A_v specifies the set of available channels for the 
transmitter v. These sets are often proper subsets of the frequency spectrum C 
licensed to an operator. Such restrictions arise, for example, along national 








borders, where a cross-border coordination of the channel use is necessary 
to prevent (strong) interference between bordering networks. The licensed 
spectrum itself is usually contiguous. 

A frequency assignment or simply an assignment is a function y: V → C. 
An assignment is feasible if every carrier v ∈ V is assigned an available 
channel and all separation requirements are met, that is, if 

\[ y(v) \in A_v \qquad \forall\, v \in V, \tag{1} \]

\[ |y(v) - y(w)| \ge d(vw) \qquad \forall\, vw \in E. \tag{2} \]



Finding a feasible assignment is closely related to coloring a graph. Fre- 
quency assignment is a generalization of list colorings and related to T- 
colorings and list T-colorings of graphs [5, Chap. 3]. Drawing on this con- 
nection, it is easily shown that finding any feasible frequency assignment is 
NP-complete in general. 

In practice, not just some feasible assignment is of interest, but assign- 
ments that minimize the sum of co- and adjacent channel interferences are 
in demand. The corresponding optimization problem 



\[ \min_{y \text{ feasible}} \;
   \sum_{\substack{vw \in E: \\ y(v) = y(w)}} c^{co}(vw)
   \; + \sum_{\substack{vw \in E: \\ |y(v) - y(w)| = 1}} c^{ad}(vw)
   \tag{FAP} \]



is called the frequency assignment problem. 
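
The model can be made concrete with a small sketch. It assumes that the labels are stored in dictionaries keyed by vertex or by edge; the function names and the three-transmitter instance are hypothetical illustrations, not data from the thesis. The first function checks conditions (1) and (2), the second evaluates the objective of (FAP).

```python
def is_feasible(y, available, separation):
    """Check conditions (1) and (2) for an assignment y: vertex -> channel."""
    if any(y[v] not in available[v] for v in available):
        return False
    return all(abs(y[v] - y[w]) >= d for (v, w), d in separation.items())


def interference(y, co, ad):
    """Objective of (FAP): co-channel plus adjacent-channel interference."""
    total = sum(c for (v, w), c in co.items() if y[v] == y[w])
    total += sum(c for (v, w), c in ad.items() if abs(y[v] - y[w]) == 1)
    return total


# Hypothetical instance: three transmitters, spectrum C = {1, 2, 3}.
available = {"a": {1, 2, 3}, "b": {1, 2, 3}, "c": {2, 3}}
separation = {("a", "b"): 2, ("b", "c"): 0, ("a", "c"): 1}
co = {("a", "b"): 0.9, ("b", "c"): 0.4, ("a", "c"): 0.7}
ad = {("a", "b"): 0.3, ("b", "c"): 0.1, ("a", "c"): 0.2}

y_good = {"a": 1, "b": 3, "c": 3}   # feasible, b and c share a channel
y_bad = {"a": 1, "b": 2, "c": 3}    # violates the separation d(ab) = 2
```

In this toy instance the only interference incurred by `y_good` is the co-channel term of the edge bc.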

This model has proven useful and is largely accepted among researchers 
and practitioners. From a computational complexity point of view, however, 
even approximating optimal solutions is very hard, see [5, Chap. 3] for 
details. Further models of practical relevance or of theoretical interest are 
discussed in [1,4-7,15]. 



3 Heuristic Planning Methods 

The focus is on planning heuristics, capable of dealing with carrier networks of 
around 2000 transmitters in a few minutes on a modern PC or workstation. 
Such methods are well-suited for practical applications, with a particular 
emphasis on intermediate iterations in the planning cycle. 

Seven heuristic planning methods are described [5, Chap. 4]: three greedy- 
type construction methods, T-Coloring, Dsatur with Costs, and Dual 
Greedy, as well as four improvement methods, Iterated 1-Opt, k-Opt, 
Vds, and Mcf. The performance of each heuristic (sometimes with aug- 
mentations) and its parameter interdependence are extensively analyzed [5, 
Chap. 5]. Most of the above methods are suited (in combination) for auto- 
matic frequency planning in practice. The Dual Greedy drops out, because 
it is slow and produces by far the poorest results. 
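
To give an impression of what a simple improvement method may look like, the following sketch implements a generic single-transmitter local search. It is an assumption about the general idea behind a 1-opt step, not the Iterated 1-Opt procedure of [5, Chap. 4], whose details are not given here. All names and the small instance are hypothetical, and the start assignment is assumed to be feasible.

```python
def cost(y, co, ad):
    """Total co-channel plus adjacent-channel interference of assignment y."""
    total = sum(c for (v, w), c in co.items() if y[v] == y[w])
    return total + sum(c for (v, w), c in ad.items() if abs(y[v] - y[w]) == 1)


def feasible(y, separation):
    """All separation requirements |y(v) - y(w)| >= d(vw) hold."""
    return all(abs(y[v] - y[w]) >= d for (v, w), d in separation.items())


def one_opt(y, available, separation, co, ad):
    """Move one transmitter at a time to its best feasible channel until stuck."""
    y = dict(y)
    improved = True
    while improved:
        improved = False
        for v in list(y):
            best_channel, best_cost = y[v], cost(y, co, ad)
            for channel in sorted(available[v]):
                trial = {**y, v: channel}
                trial_cost = cost(trial, co, ad)
                if feasible(trial, separation) and trial_cost < best_cost - 1e-12:
                    best_channel, best_cost = channel, trial_cost
            if best_channel != y[v]:
                y[v] = best_channel
                improved = True
    return y


# Hypothetical instance: three transmitters, channels 1..4 available everywhere.
available = {v: {1, 2, 3, 4} for v in "abc"}
separation = {("a", "b"): 1}
co = {("b", "c"): 0.5, ("a", "c"): 0.5}
ad = {("a", "b"): 0.2, ("b", "c"): 0.1}

start = {"a": 1, "b": 2, "c": 2}
result = one_opt(start, available, separation, co, ad)
```

Each accepted move strictly lowers the interference, so the loop terminates. Recomputing the full cost for every trial is only acceptable for toy instances; a practical method would update the cost incrementally.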

The interplay of various combinations of these methods is studied on the 
basis of eleven realistic planning instances, which have been made available 








over the Internet [9]. Table 1 displays several characteristic parameters of 
the constraint graphs associated with the planning instances. Notice the high 
average degrees of the vertices, i.e., the large numbers of transmitters that 
may directly be affected by the frequency assignment to one transmitter. 
Notice also the large maximum clique numbers, which in most cases prove 
directly that an interference-free frequency assignment is impossible. 



Table 1. Characteristics of constraint graphs of realistic planning instances [9] 

Instance   |V|   density [%]  avg. degree  max. degree  max. clique                               channels
K           267     56.57        151.0         238           69        1053     19111      996     50
B[0]       1886     13.59        256.4         779           81        7288    234479     4263     75
B[1]       1971     13.46        265.3         805           84        7996    253441     4825     75
B[2]       2214     13.50        299.0         916           93       10284    320684     6871     75
B[4]       2775     13.44        373.0        1133          120       16663    500805    12524     75
B[10]      4145     13.41        555.9        1704          174       38234   1113850    33548     75
Sie1        930      9.03         84.0         209           52        6039     33002     9911     75
Sie2        977     49.17        480.4         877          182       17761    216912    25615     43
Sie3       1623      9.18        149.1         519           78       23093     97861    15069     76
Sie4       2785     10.50        292.3         752          100       27964    379052    26445     39
Sw          310      8.29         25.7          94           21        3984         0     2075     3 + 49
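
The clique argument mentioned before Table 1 is a pigeonhole argument: if a clique contains more transmitters than there are channels, two adjacent transmitters must share a channel. The sketch below applies this test to the instances. It rests on two assumptions that are not confirmed by the text: that the clique sizes and channel counts are read correctly from Table 1 (with 3 + 49 counted as 52 channels), and that every edge in such a clique carries a separation requirement or positive co-channel interference.

```python
# (maximum clique size, number of channels), as read from Table 1.
# The column interpretation is an assumption.
instances = {
    "K": (69, 50), "B[0]": (81, 75), "B[1]": (84, 75), "B[2]": (93, 75),
    "B[4]": (120, 75), "B[10]": (174, 75), "Sie1": (52, 75),
    "Sie2": (182, 43), "Sie3": (78, 76), "Sie4": (100, 39), "Sw": (21, 52),
}


def clique_exceeds_spectrum(clique_size, num_channels):
    """Pigeonhole test: more clique vertices than channels forces a shared channel."""
    return clique_size > num_channels


proved = [name for name, (q, c) in instances.items()
          if clique_exceeds_spectrum(q, c)]
```

Under these assumptions the test succeeds for nine of the eleven instances, which is consistent with the statement that the argument works in most cases.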



4 Automatic Frequency Planning 

In essence, the following observations are made [5, Chap. 5]. There is one 
particularly strong combination of the fast heuristics presented: a self-tuning 
variant of the Dsatur with Costs start heuristic, combined with the Vds 
improvement heuristic. This combination achieves a decent balance between 
solution quality and running times. 

The resulting assignments are usually not much worse than those obtained 
by the elaborate Threshold Accepting method [4, Section 4.2.5]. With 
respect to the maximum incurred co- and adjacent channel interference, they 
are even sometimes better. The precise running times of Threshold Ac- 
cepting are not public, but they are roughly one order of magnitude higher 
than those of the fast heuristic combinations. In case yet faster methods are 
needed, a combination of a self-tuning variant of T-Coloring with Vds or, 
even faster, with Iterated 1-Opt may be attractive. 

Interference plots are commonly used for frequency planning. They depict 
the (likely) occurrence of interference on the basis of the signal level predic- 
tions. Figure 3 contains two such plots, where the difference in dB between 








the serving sector’s signal and the second strongest signal at the same fre- 
quency is color-coded. Clearly visible are the interference reductions achiev- 
able with the proposed methods in comparison to a formerly established, 
commercial routine. (This routine has meanwhile been replaced by the tool 
vendor.) In another example more than 96% of the interference could have 
been removed [5, Chap. 5]. 




Fig. 3. Interference plots: improvements from optimization 



In order to rigorously assess the quality of the plans, lower bounds on 
the amount of unavoidable interference are in demand. By far the best lower 
bounds on the unavoidable co-channel interference are currently obtained 
from semidefinite programming. Significant bounds are given for the five sce- 
narios K, B[4], B[10], Sie2, and Sie4 [5, Chap. 6]. The reported bounds LB 
yield quality gaps, computed as 1 — LB/UB, between the provably unavoid- 
able co-channel interference and a “good” heuristic solution with co- and 
adjacent channel interference totaling UB. The gaps are 50% for K, 77% 
for B[4], 63% for B[10], 53% for Sie2, and 66% for Sie4. From the application 
point of view, these gaps may not be satisfying. Nevertheless, they are the 
first noteworthy bounds on the gap for large realistic instances. 
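
The gap measure used above is simple to evaluate. The following sketch computes 1 - LB/UB; the function name and the numbers in the example are hypothetical and are not values from the thesis.

```python
def quality_gap(lower_bound, upper_bound):
    """Relative gap 1 - LB/UB between a lower bound and a heuristic solution."""
    if upper_bound <= 0:
        raise ValueError("upper bound must be positive")
    return 1.0 - lower_bound / upper_bound


# Hypothetical values: LB = 0.23 (co-channel only), UB = 0.46 (co + adjacent).
gap = quality_gap(0.23, 0.46)
```

A gap of 50%, as in this example, means that at most half of the interference in the heuristic solution could still be removed by a better plan.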

The link between the bound on unavoidable co-channel interference in 
frequency planning and semidefinite programming is a semidefinite relaxation 
of the well-known graph minimum k-partition problem [5, Chaps. 7, 8]. These 
problems are obtained by relaxing the original frequency planning problems 
as follows. Each vertex may receive any of the k frequencies in the available 
spectrum; all separation requirements are reduced to at most 1; and the 
adjacent channel interference is ignored. A lower bound for the optimal 
k-partition, and hence a bound on the unavoidable co-channel interference, is 
computed by solving the semidefinite relaxation of the k-partition instances. 
The dual semidefinite programming solvers [3, 10] are used for this purpose. 
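
For orientation, the semidefinite relaxation of minimum k-partition that is common in the literature can be stated as follows. We assume that this is the form referred to here; the symbols n (number of vertices), w_{ij} (edge weights on the complete graph) and X (the matrix variable) are introduced for illustration only.

```latex
\begin{align*}
\min\ & \sum_{1 \le i < j \le n} w_{ij}\,\frac{(k-1)\,X_{ij} + 1}{k} \\
\text{s.t.}\ & X_{ii} = 1, && i = 1,\dots,n, \\
& X_{ij} \ge -\frac{1}{k-1}, && 1 \le i < j \le n, \\
& X \succeq 0 .
\end{align*}
```

A partition into at most k blocks corresponds to X_{ij} = 1 if i and j lie in the same block and X_{ij} = -1/(k-1) otherwise, so that the objective counts exactly the weights of edges inside blocks.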









5 Conclusions 

Planning the use of frequencies is a central task in managing a GSM network. 
It is a cornerstone for providing the desired grade and quality of service. Three 
planning situations are distinguished. 

• In the relaxed situation, a new frequency assignment is to be generated for 
a large network region. Many frequencies are available, and the objective 
is to minimize interference, thus providing radio service at high quality. 

• In the congested situation, again a new plan for large network portions is 
to be produced, but the number of available frequencies hardly suffices to 
provide the desired grade of service (at the lowest acceptable level of quality). 

• In the adaption case, the assignment shall be adapted locally to changes 
in the network. 

Each of these situations seems to call for different planning methods. For 
surveys directed towards the “congested” case, see [11,12,15]. The “adap- 
tion” case has hardly been addressed explicitly yet. The focus here is on the 
“relaxed” planning situation. 

The goal was to design algorithms for generating frequency plans quickly 
that incur as little interference as possible. They are particularly attractive 
for interactive planning processes, where alternative plans are produced for 
tentative network changes. Heuristic methods with low theoretical running 
times were proposed. Their computational behavior was analyzed on eleven 
realistic, publicly available scenarios. 

Several of these methods are successfully used at E-Plus. Better frequency 
assignments are obtained much more quickly than through the previous planning 
process. The software is also incorporated into a commercial GSM radio net- 
work planning tool as the standard frequency planning component. 

The development of new planning methods seems to have slowed down lately. 
A lack of demand from major European GSM operators might be one reason. 
Market saturation is approaching, and the need for network expansions 
is decreasing. A more fundamental reason may lie in the difficulty of providing 
reliable interference predictions. These are the basis for the frequency plan 
optimization. At the current stage, the quality of a frequency assignment may 
depend more on the field strength prediction model used for interference 
prediction than on which modern planning heuristic is applied [4]. 

GSM is a second generation, digital system for mobile communication. 
The upcoming Universal Mobile Telecommunication System (UMTS) is a 
third generation, offering transmission rates up to 384 kbps. UMTS uses a 
fundamentally different way to support multiple radio links in parallel (code 
division multiple access). Frequency planning is no longer necessary with 
UMTS. A high price has to be paid for this convenience, however: the provisioning 
of coverage and that of capacity are tightly coupled. Base stations affect each other 
much more with respect to coverage and capacity. This spawns a new line of 
research, focusing on dimensioning UMTS radio networks [2, 8, 13, 14]. 








References 

1. Aardal K.I., van Hoesel S.C.P.M., Koster A.M.C.A., Mannino C., Sassano A. 
(2001). Models and solution techniques for the frequency assignment problem. 
ZIB-report 01-40, Konrad-Zuse-Zentrum für Informationstechnik Berlin, 
Germany. URL http://www.zib.de/PaperWeb/abstracts/ZR-01-40/. 

2. Amaldi E., Capone A., Malucelli F. (2002). Planning UMTS base station 
locations: Optimization models with power control and algorithms. IEEE 
Transactions on Wireless Communications, 1. 

3. Burer S., Monteiro R.D., Zhang Y. (1999). Interior-point algorithms for 
semidefinite programming based on a nonlinear programming formulation. 
Tech. Rep. TR 99-27, Department of Computational and Applied Mathematics, 
Rice University. 

4. Correia L.M. (ed.) (2001). COST 259: Wireless Flexible Personalized 
Communications. John Wiley & Sons Ltd. 

5. Eisenblätter A. (2001). Frequency Assignment in GSM Networks: Models, 
Heuristics, and Lower Bounds. Cuvillier-Verlag. URL 
ftp://ftp.zib.de/pub/zib-publications/books/PhD_eisenblaetter.ps.Z. 

6. Eisenblätter A., Grötschel M., Koster A.M.C.A. (2002). Frequency assignment 
and ramifications of coloring. Discussiones Mathematicae Graph Theory, 
22:51-88. URL http://www.zib.de/PaperWeb/abstracts/ZR-00-47/. 

7. Eisenblätter A., Grötschel M., Koster A.M.C.A. (2002). Frequenzplanung im 
Mobilfunk. DMV-Mitteilungen, (1):18-25. URL 
http://www.zib.de/PaperWeb/abstracts/ZR-02-09/. In German. 

8. Eisenblätter A., Koch T., Martin A., Achterberg T., Fügenschuh A., Koster A., 
Wegel O., Wessäly R. (2002). Modelling feasible network configurations for 
UMTS. In Telecommunications Network Design and Management, pp. 1-24. 
Kluwer Academic Publishers. 

9. FAP web (2000). FAP web - A website about Frequency Assignment Problems. 
Eisenblätter A., Koster A. URL http://fap.zib.de/. 

10. Helmberg C. (2000). Semidefinite programming for combinatorial optimization. 
Habilitationsschrift. Technische Universität Berlin, Germany. 

11. Jaumard B., Marcotte O., Meyer C. (1999). Mathematical models and exact 
methods for channel assignment in cellular networks. In Sanso B., Soriano P. 
(eds.). Telecommunications Network Planning, chap. 13, pp. 239-255. Kluwer 
Academic Publishers. 

12. Koster A.M.C.A. (1999). Frequency Assignment - Models and Algorithms. 
Ph.D. thesis, Universiteit Maastricht, The Netherlands. 

13. Mathar R., Schmeink M. (2000). Optimal base station positioning and channel 
assignment for 3G mobile networks by integer programming. Tech. rep., RWTH 
Aachen, Germany. 

14. MOMENTUM (2001). Models and simulations for network planning and 
control of UMTS. URL http://momentum.zib.de. European Information Society 
Technologies (IST) project, IST-2000-28088. 

15. Murphey R.A., Pardalos P.M., Resende M.G.C. (1999). Frequency assignment 
problems. In Du D.Z., Pardalos P.M. (eds.). Handbook of Combinatorial 
Optimization, Kluwer Academic Publishers. 





Optimal Control of Methadone Treatment in 
Preventing Blood-Borne Disease* 



Julia Almeder 

Institute for Econometrics, Operations Research, and Systems Theory, 

Vienna University of Technology, Argentinierstr. 8/119, A-1040 Vienna, Austria; 
email: julia.almeder@rkag.at 



Abstract. In this paper an optimal control model describing the influence of 
methadone maintenance treatment (MMT) on the spread of blood-borne diseases 
like Hepatitis C Virus (HCV) and Human Immunodeficiency Virus (HIV) among 
injection drug users is analyzed. The aim of the model is to find the optimal policy 
to minimize the discounted stream of the overall costs arising from MMT and the 
social costs caused by new infections from HIV and HCV. 



1 Introduction 

Human Immunodeficiency Virus (HIV) and Hepatitis C (HCV) are the most 
common diseases among injection drug users (IDUs). It is estimated that 
about 25% of the new HIV infections in the U.S. are a result of transmission 
from needle sharing among drug users [7]. HCV is less lethal than HIV and 
therefore the social costs (e.g. treatment, medication...) associated with one 
infected person are far below the costs caused by one IDU infected with HIV. 
But HCV, which has a prevalence of more than 50% in many populations 
[2], is far more widespread than HIV. The reason for that is a very high 
infectivity rate. (A study based on data of hospital workers in the U.S., who 
were exposed to HCV through accidents with infected needles, observed an 
infectivity rate between 3 and 9% [1].) It is estimated that the probability of 
being infected with HCV is about four times higher than that of being infected 
with HIV. For this reason, treatment and prevention interventions have proven 
less successful in slowing HCV infection than they have in slowing the spread 
of HIV [1,2]. 

Syringe exchange programs (SEP) and methadone maintenance treatment 
(MMT) are two of the most important instruments which can slow down 
the disease spread. SEP have no impact on the frequency and duration of 
the clients’ drug use, but they can reduce infectious disease spread through 
provision of sterile syringes. In contrast to that, MMT, which is more cost- 
intensive than SEP [3], tries to reduce the injection drug use. This has two 

* This research was partly financed by the Austrian Science Foundation (FWF) 
under Contract No. P14060-OEK (“Dynamics and Control of Illicit Drug 
Consumption”). 







results: on the one hand, it slows the spread of HCV and HIV because it 
reduces the number of participants in needle-sharing groups, and on the other 
hand, it can also reduce side effects of illicit drug use like drug-related crime. 

Pollack [8,9] provides an explicit epidemiological model to explore the 
impact of substance abuse treatment on the incidence and prevalence of HIV 
and HCV. Studying the cost-effectiveness of MMT, Pollack computes the 
(average and marginal) cost per infection averted. Although his approach is 
inter-temporal, the number of treatment slots, M, may not vary over time 
in his model. More precisely, he minimizes the cost per averted infection by 
comparing various values of M. 

The purpose of the present paper is to extend Pollack’s analysis to a 
dynamic cost-effectiveness analysis in which the number of treatment slots 
M acts as a time-dependent control variable. The performance functional 
contains (at least) two terms, i.e. the social costs (damage) created by the 
incidence of new HIV/HCV cases and the costs of MMT. Optimal control 
theory (see, e.g., [4]) provides a useful tool to investigate such inter-temporal 
trade-offs (between damages and costs). 

2 Model Formulation 

We start by presenting the dynamic equations Pollack used in his paper 
[8]. Let δ, μ, and γ represent the exit rate from the active IDU population, 
the exit rate from treatment, and the permanent cure rate of treatment, 
respectively. Then the dynamics of the IDU population N(t) are given by 

\[ \dot N(t) = \theta - (N(t) - M)\,\delta - M \mu \gamma, \tag{1} \]

with initial condition N(0) = N_0. Here θ is the exogenous inflow of drug users, 
M is the number of users under methadone treatment, (N(t) − M)δ is the 
outflow of users not in MMT, and Mμγ is the outflow of users getting MMT. 
Note that 

\[ \mu \gamma > \delta, \tag{2} \]

i.e. the desistance rate is higher for users in treatment. 

According to Pollack [8], the number of infected IDUs, I(t), evolves over 
time as follows (the time argument t is partly omitted in what follows to increase 
readability): 

\[ \dot I = \Bigl(1 - \frac{M}{N}\Bigr)(-\delta I)
          + \Bigl(1 - \frac{M}{N}\Bigr)(\vartheta N - I)\,\kappa \lambda\,\frac{I}{\vartheta N}
          - \mu \gamma\,\frac{M}{N}\,I \tag{3} \]

with initial condition I(0) = I_0. Here, κ, λ, and ϑ denote the infectivity, 
the arrival rate into shooting galleries, and the proportion of shooting gallery 
participants, respectively. (Shooting galleries are places where IDUs meet 
and share needles.) (1 − M/N)(−δI) is the outflow of infected users getting no 
MMT. (1 − M/N)(ϑN − I) are the candidate users to be infected, while I/(ϑN) 
is the probability of sharing a needle with an infected user. Finally, μγ(M/N)I is 
the outflow of infected users receiving MMT. 

In this work we assume that N(t) = N is constant while M = M(t) is a 
time-dependent control variable. We change the state and control variables 
to proportions J = I/N and U = M/N. Hence, the new state equation is 

\[ \dot J = J\,\bigl[\,-\delta - aU + (1 - U)(b - fJ)\,\bigr], \tag{4} \]

where a = μγ − δ, b = κλ, and f = κλ/ϑ. 
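
The behavior of the state equation under a constant treatment share can be explored numerically. The sketch below integrates dJ/dt = J(−δ − aU + (1 − U)(b − fJ)) with explicit Euler steps of one day. It assumes the definition a = μγ − δ and uses the HCV parameter values specified in Section 3; the function names and the choice of integrator are illustrative only.

```python
def rhs(J, U, delta, a, b, f):
    """Right-hand side dJ/dt = J * (-delta - a*U + (1 - U)*(b - f*J))."""
    return J * (-delta - a * U + (1.0 - U) * (b - f * J))


def simulate(J0, U, days, delta, a, b, f, dt=1.0):
    """Explicit Euler integration with a constant treatment share U."""
    J = J0
    for _ in range(int(days / dt)):
        J += dt * rhs(J, U, delta, a, b, f)
    return J


# Parameter values per day, HCV case (see Section 3).
delta = 1.0 / 4000.0                # exit rate from the IDU population
mu, gamma = 1.0 / 400.0, 0.75       # treatment exit rate and cure rate
kappa, lam, vartheta = 0.04, 1.0 / 7.0, 0.3
a = mu * gamma - delta              # assumed definition of a
b = kappa * lam
f = kappa * lam / vartheta

J_no_control = simulate(0.3, 0.0, 20000, delta, a, b, f)
J_high_control = simulate(0.3, 0.9, 20000, delta, a, b, f)
```

Without treatment the infected share settles at (b − δ)/f, which is close to the share ϑ of needle-sharing users; with 90% of users in treatment it decays towards zero.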

The objective function has the form 

\[ \max_{U} \; -N \int_0^{\infty} e^{-rt}
   \bigl[\, N U^{2} + \nu J (1 - U)(b - fJ) \,\bigr]\, dt, \tag{5} \]

which represents the discounted stream of treatment costs and social costs 
of the newly infected IDUs, with r and ν denoting the discount rate and the 
lifetime social cost per newly infected user, respectively. 

In order to solve this optimal control problem the maximum principle 
according to [4] is applied to the model equations. 

3 Specification of Parameters 

It is assumed that IDUs leave the population at random with the constant 
rate δ per person per time unit. Thus, the drug career length is exponentially 
distributed with mean duration 1/δ of 4000 days (or about 11 years). This exit 
rate is also assumed to be independent of the disease status, which seems 
reasonable because of the long, largely asymptomatic post-infection period 
for both HIV and HCV. 

To simplify the complex pattern of drug-using behavior, we assume that 
30% of IDUs participate in needle-sharing (ϑ = 0.3). The remaining 70% do 
not face any disease risk. The shooting gallery participants have a constant 
arrival rate of λ = 1/7 per day (i.e. once per week). 

The infectivity rate κ, which describes the probability of a disease 
transmission when an uninfected person shares a needle with an infected IDU, is 
quite different for HIV and HCV. Hospital accident data (cf. Section 1) 
suggest that HCV is easily spread through syringes and other sources. In one of 
these studies [1], an infectivity rate between 3% and 9% was observed. 
Infectivity under realistic needle sharing conditions among IDUs is not known. 
Hence, κ is assumed to be 0.04. HIV has a much smaller infectivity rate of 
about 1%. 

The mean treatment length 1/μ is assumed to be 400 days (about 1.1 years). 
Furthermore, it is assumed that the exit rate from injection drug use due to 
treatment, γ, is 0.75. For the discount rate r we assume a value of 3% per year 
or 0.03/365 per day. 

To estimate the lifetime social costs of one new infection, ν, we use the 
lifetime expected treatment costs, which is clearly an underestimate. In the 








case of HIV, Holtgrave and Pinkerton [6] give a lifetime discounted present 
value of $195000. Wong [10] puts the lifetime treatment costs of HCV at 
$19900 for conservative treatment. 

4 Results 

Analyzing the canonical system resulting from the maximum principle with 
the parameters for HIV leads to the following result: Independent of the initial 
situation it is always optimal to approach a stable equilibrium with nearly no 
infected IDUs and about 38% of the IDU population under treatment. For 
details see figure 1. 

Fig. 1. Optimal control for HIV (vertical axis: u; both axes range from 0 to 1) 



In the case of HCV the results are similar, but the level of IDUs under 
treatment necessary to remain in the stable equilibrium is about 74%. This 
is due to the fact that HCV has a much higher infectivity rate. 

Comparing the optimal dynamic control with the case of constant control 
which does not vary over time, the dynamic control reduces the total social 
costs significantly, both for HIV and HCV (see table 1). 

The major outcome of this work can be summarized as follows: 

• It is never optimal to completely eradicate the infection by applying full 
MMT all the time. 

• In most cases it is optimal to move into a stable equilibrium with only a 
few infected people (usually below 0.01% of all IDUs). 





Strategy                    Costs HIV     Costs HCV
optimal control             4774.35 N     8134.15 N
no control                  138975 N      17113.7 N
full control                12163.4 N     12163.4 N
optimal constant control    12090.7 N     10946.3 N

Table 1. Total costs for HIV and HCV for different strategies and initial value 
J = 0.3 



• For a plausible range of the estimated parameters the optimal solution 
does not depend on the initial situation. 

• It is very unlikely that no control at all is optimal. As the sensitivity
analysis shows, the occurrence of a Dechert-Nishimura-Skiba threshold
(cf. [5]) is only possible for parameter values far from the original
estimates (especially for the HIV parameters). Furthermore, if no control is
applied, the number of infected persons tends to a very high level, at which
nearly all shooting gallery participants are infected. 

• A dynamic control where the number of treatment slots varies over time 
is much better than a constant level of the MMT program. A constant 
control at a high level has comparable effects on the number of infections, 
but the long run costs are significantly higher. 
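
To make the comparison of strategies concrete, the following sketch computes the relative savings of the optimal dynamic control from the cost coefficients of Table 1. It assumes the table is read as four strategies with one HIV and one HCV value each (all as multiples of N); the dictionary layout and the function name `saving_vs` are illustrative.

```python
# Cost coefficients from Table 1 (multiples of N), as read from the table.
costs = {
    "optimal control":          {"HIV": 4774.35, "HCV": 8134.15},
    "no control":               {"HIV": 138975.0, "HCV": 17113.7},
    "full control":             {"HIV": 12163.4, "HCV": 12163.4},
    "optimal constant control": {"HIV": 12090.7, "HCV": 10946.3},
}

def saving_vs(strategy: str, disease: str) -> float:
    """Relative cost reduction of the optimal dynamic control
    compared with the given alternative strategy."""
    return 1.0 - costs["optimal control"][disease] / costs[strategy][disease]

for disease in ("HIV", "HCV"):
    for strategy in ("no control", "full control", "optimal constant control"):
        print(f"{disease}: {saving_vs(strategy, disease):.1%} cheaper than {strategy}")
```

Since every entry is proportional to N, the ratios do not depend on the population size.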

5 Conclusions and Extensions 

This work uses a random-mixing epidemiological model, and it applies op- 
timal control theory to derive the optimal path of services to maximize the 
net benefits associated with infectious disease control among injection drug 
users. Optimal control theory provides a useful tool to explicitly investigate 
the tradeoffs between protection and program costs. 

Like any formal analysis, this model includes limitations. We use a random- 
mixing model most appropriate for populations with prevalent random shar- 
ing. This random-mixing model can be extended to overlapping subgroups 
or more complex compartment models. Segregated subgroups tend to de- 
press disease incidence by reducing the proportion of “discordant” needle- 
sharing that matches infected and uninfected IDUs. More sociologically com- 
plex models provide a more sophisticated framework to examine the context 
of needle-sharing and infectious disease spread. However, mathematical mod- 
els indicate that the random-mixing model provides a good approximation 
to non-random models when there is even a small degree of overlap across 
sharing networks. 

This analysis focuses on an idealized harm reduction intervention. We do 
not consider many other benefits associated with harm reduction interven- 
tions. Methadone maintenance treatment and best-practice syringe exchange 
programs include diverse components to shorten drug-using careers and to 








otherwise halt or reduce individuals’ injection drug use. Including these ben- 
efits in the intervention would increase our estimates of program effectiveness 
and might also alter our analysis of optimal policy. 

There are several other possible extensions of the model presented in this
paper. One of the main simplifications of the current model is the assumption
of a constant size of the injection drug users' population. In particular, the
optimal MMT program leading to the stable equilibrium takes several years,
and during this time the IDU population can change considerably, especially
in the presence of an MMT program. A model taking this into account has
already been presented in section 2. 

We conclude this paper by pointing to another fact, which is completely 
neglected in the current model formulation. Definitely, it is not true that HIV 
and HCV occur separately. Hence, a model which combines both diseases 
would be more realistic. However, such a model would have to deal with at 
least four state variables, which describe the following four subgroups of the 
IDU population: infected with HIV and HCV, infected with HIV but not with 
HCV, infected with HCV but not with HIV, and not infected, respectively. 

References 

1. M. Alter and L. Moyer. The importance of preventing Hepatitis C Virus in- 
fection among injection drug users in the United States. Journal of Acquired 
Immune Deficiency Syndromes and Human Retrovirology, 18(S1):6-10, 1998. 

2. CDC. Recommendations for prevention and control of Hepatitis C Virus (HCV) 
infection and HCV-related chronic disease. Technical Report 47 (RR 19), 
MMWR, 1998. 

3. T. D'Aunno, T. Vaughn, and P. McElroy. An institutional analysis of HIV
prevention efforts by the nation's outpatient drug abuse treatment units.
Journal of Health and Social Behaviour, 40(2):175-92, 1999. 

4. G. Feichtinger and R.F. Hartl. Optimale Kontrolle ökonomischer Prozesse. 
Walter de Gruyter, Berlin, 1986. 

5. G. Feichtinger and G. Tragler. Skiba thresholds in optimal control of illicit drug 
use. In G. Zaccour, editor. Optimal Control and Differential Games. Kluwer 
Academic Publisher, Boston, 2002. 

6. D.R. Holtgrave and S.D. Pinkerton. Updates of cost of illness and quality 
of life estimates for use in economic evaluations of HIV prevention programs. 
Journal of Acquired Immune Deficiency Syndromes and Human Retrovirology, 
16(1):54-62, 1997. 

7. IOM. No Time to Lose: Making the Most of HIV Prevention. National Academy 
Press, Washington, DC, 2000. 

8. H.A. Pollack. The cost-effectiveness of methadone in preventing blood-borne 
disease: A comparison of HIV and Hepatitis C. Working Paper, 2000. 

9. H.A. Pollack. Controlling infectious diseases among injection drug users: Learn- 
ing (the right) lessons from HIV. Working Paper, 2001. 

10. J.B. Wong. Cost-effectiveness of treatments for chronic Hepatitis C. American 
Journal of Medicine, 107(6B):74-78, 1999. 





Strategies for Avoiding Setup Times in
Electronics Manufacturing



Claus-Burkard Böhnlein 

Lehrstuhl für BWL und Wirtschaftsinformatik, Universität Würzburg, 
Neubaustraße 66, D-97070 Würzburg, boehnlein@wiinf.uni-wuerzburg.de 



1 Problem Setting 

The introduction of surface mount technology laid the foundation for highly
automated production in the electronics industry. Thanks to the standardization
and miniaturization of surface-mounted components (surface mounted devices,
SMD), common SMD placement systems can process a large variety of SMD package
types (Pawlischek 1991). This led to a shift from job-shop-oriented to
line-oriented production control (Schweitzer 1994). 

In recent years, the mass production of electronic goods has concentrated in
Asia. In the high-wage regions of Western Europe, by contrast, production
increasingly takes place in small series or to customer order (ZVEI 2001).
Small lot sizes and frequent job changes, however, cause SMD placement lines to
stand idle for 30-40% of their operating time because of changeovers, and these
lines usually form a bottleneck in electronics manufacturing. The planning
systems available today cannot solve this problem satisfactorily (o.V. 2001). 



2 Planning in SMD Placement 

Because of its diverse problems, production planning is divided into several
planning levels. Following FELDMANN, ROTH and ROTHHAUPT (Feldmann and Roth
1991; Feldmann et al. 1992; Rothhaupt 1995) as well as GÜNTHER (Günther et al.
1996), five levels can be distinguished, which are described below (cf. Fig. 1): 

1. Changeover-minimal job loading 

If several placement lines are available for a placement job with a small lot
size, the job tends to be scheduled so that the lowest changeover effort and, at
the same time, a placement with good line balancing can be achieved. This is an
assignment problem, subject to delivery dates as well as technical and capacity
restrictions, with the objective of maximizing the performance of the overall
system or achieving the most favorable cost structure possible. 

2. Changeover-minimal job sequence in front of a placement line 

By forming a changeover-minimal sequence for the jobs waiting in front of the
placement line, idle times are avoided and system performance is increased. 



Fig. 1. Planning levels in SMD placement (planning objects: plant, line, machine; planning levels: 1 job loading, 2 changeover-minimal job sequence, 3 assignment of the component feeder systems, 4 arrangement of the component feeder systems, 5 formation of the placement sequence; legend: L1 = line 1, L2 = line 2, LP = printed circuit board) 



3. Optimal assignment of the component feeders to the placement systems 

If a placement line contains more than one placement system, the job must be
split by assigning the component types to be placed on the printed circuit
board to individual placement systems. On the one hand, technical restrictions
must be observed; on the other hand, with a view to a changeover-minimal
assignment, the initial setup should be changed as little as possible. 

4. Arrangement of the component feeders on the placement systems 

The assignment made in step 3 is refined here for each placement system.
Whereas reducing the setup effort was the priority so far, the aim now is to
minimize the placement time for the job through a favorable positioning of the
component feeders on the individual placement systems of the line. This is
done, for example, by placing frequently needed components close to the
placement table in order to achieve short travel paths of the placement head
between the pick-up position of a component at the feeder and its target
position on the printed circuit board. 








5. Optimization of the placement sequence 

Based on planning steps 3 and 4, the most favorable placement sequence is
formed for each job and stored as a placement program. The goal is to
determine, for each placement head of a line, the total travel path with the
shortest duration for the respective job and thus to minimize the placement
time. 
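
Planning level 2 above (a changeover-minimal job sequence) can be illustrated with a simple nearest-neighbour rule. This is a minimal sketch under strong assumptions and not the SIMOS method: each job is described only by the set of component types it needs, the line is assumed to carry exactly the current job's components, and the changeover cost is the number of component types that must be newly set up. All job names and component names are hypothetical.

```python
def changeover_cost(current_setup: set, next_setup: set) -> int:
    """Number of component types the next job needs that are not set up yet."""
    return len(next_setup - current_setup)

def greedy_sequence(jobs: dict, start: str) -> list:
    """Nearest-neighbour ordering: always run next the waiting job
    that needs the fewest new feeders (ties broken by job name)."""
    remaining = dict(jobs)
    order = [start]
    current = remaining.pop(start)
    while remaining:
        nxt = min(sorted(remaining),
                  key=lambda j: changeover_cost(current, remaining[j]))
        order.append(nxt)
        current = remaining.pop(nxt)
    return order

# Hypothetical jobs and the component types they require.
jobs = {
    "A": {"R1", "R2", "C1"},
    "B": {"R1", "C2", "IC7"},
    "C": {"R1", "R2", "C1", "C3"},
    "D": {"C2", "IC7", "IC9"},
}
order = greedy_sequence(jobs, "A")
print(order)
```

A greedy rule of this kind is fast but generally not optimal; it only serves to show what "changeover-minimal" means for a queue of waiting jobs.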



3 SIMOS Setup Concept for Small-Series Production 

In industrial practice, the actual average placement rates are well below the
values of the technical system specification, especially when small lot sizes
are produced. The actual performance losses can essentially be attributed to
two areas: first, idle times due to changeovers, disruptions, etc., and second,
poor line balancing, i.e. an uneven workload across the placement units. This
has several causes: 

□ In practice, the setup of the placement line is adapted for each new
placement job. The objectives of reducing the setup effort on the one hand and
improving line balancing, and thus the placement time for the job, on the other
hand are then in conflict. As a consequence, setup steps that would improve
line balancing are, for example, not carried out for small lot sizes if the
effort they require cannot be offset by savings in the placement phase. 

□ Because the individual planning levels depend on one another, processing the
planning steps sequentially usually does not lead to an overall optimum. 

□ Concepts such as fixed setups or multiple setups of individual component
types are known in the literature, but they are not used consistently in
variant production because of unsuitable setup tactics and tools. 

The SIMOS setup concept (SImultane MatrixOrientierte Stücklistenauflösung,
i.e. simultaneous matrix-oriented bill-of-materials explosion) is based on the
fundamental idea of questioning the uniqueness of a production job in SMD
placement. If the planning horizon is extended sufficiently, a largely stable
job mix emerges with an equally stable spectrum of component types. This is
the starting point for determining permanent setups, so-called fixed setups
(Festrüstungen). In practical tests, planning horizons, and thus lifetimes of
the computed fixed setups, of up to several months have proven successful. 

In SMD placement with small-series production, experience shows that about
80% of all placed components (by quantity) are used in fewer than 20% of all
printed circuit board types. In addition, analyses have shown that individual
component types are needed in very large numbers and in the majority of all
placement programs. As a rule of thumb from the results so far, about 20% of
all components placed in a planning period fall on only 1% of the component
type spectrum. The SIMOS planning tool specifically identifies these component
types, sets them up multiple times on the placement line, and uses them to
improve line balancing by modifying the placement programs. 
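
The identification of such high-volume component types can be sketched as a simple cumulative-share analysis. The following code is an illustration only and not the SIMOS planning tool; the function name `high_runner_types`, the data layout and the sample counts are assumptions.

```python
from collections import Counter

def high_runner_types(placements: dict, target_share: float):
    """Return the smallest set of component types (most-used first) whose
    placements cover at least target_share of all placements, together
    with the share actually covered."""
    total = sum(placements.values())
    chosen, covered = [], 0
    for comp_type, count in Counter(placements).most_common():
        if covered >= target_share * total:
            break
        chosen.append(comp_type)
        covered += count
    return chosen, covered / total

# Hypothetical placement counts per component type in one planning period.
placements = {"R-10k": 500, "C-100n": 300, "R-1k": 80,
              "IC-A": 60, "LED": 40, "CONN": 20}
types, share = high_runner_types(placements, 0.20)
print(types, share)
```

In a real data set with several thousand component types, the same loop would show how few types are needed to reach the 20% share quoted above; those types are the natural candidates for multiple setups.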

With the aim of increasing placement performance in small-series production,
various setup tactics were developed, implemented, and tested in the SIMOS
research project. All tactics attempt, through suitable measures, to shorten
the average setup time per job without noticeably worsening line balancing. As
a consequence, system performance can be increased thanks to reduced idle
times, and the setup time saved can be used for faster material provisioning,
troubleshooting, maintenance work, etc. This improves production operations in
the placement line (Böhnlein 2001). 

For the introduction of the SIMOS setup concept in SMD placement, a suitable
process model with six phases was developed (cf. Fig. 2). 



Fig. 2. Process model for the SIMOS introduction (Phase 1: job structure analysis, product structure analysis, production process analysis; Phase 2: determination of the efficient setup capacity in the line; Phase 3: adaptation of the line structure; Phase 4: computation of the SIMOS setups; Phase 5: system integration and process adaptation; Phase 6: product development with preferred component types) 

In Phase 1, a job analysis, a product structure analysis, and a production
process analysis are carried out to explore the scope for adaptation measures.
Based on the results of these analyses, the line configuration is examined and
adapted from a setup perspective in Phases 2 and 3. In Phase 4, suitable setups
are computed according to the job and product structure. To conclude the SIMOS
introduction, the SIMOS tools are integrated into the existing system landscape
in Phase 5. Phase 6 goes beyond a pure system introduction: by supplying the
job preparation, development, materials management, and sales departments with
SIMOS information, important insights from production are put to profitable use
for the company in these departments. 



4 Results from Practical Use 

The SIMOS setup concept has been in productive use in an industrial production
facility since 1998 to determine fixed setups, and it was scientifically
monitored in productive operation from 07/1998 to 03/2000. After an adaptation
of the line layout (Phase 3: extension from 2 to 3 placement machines per
line), clear improvements were achieved, particularly in bottom-side placement.
The lifetime of the SIMOS fixed setups used so far is between 3 and 6 months,
and new printed circuit board types can be added to existing fixed setups
without changing existing setup assignments. Of the placement program variants
produced per half-year, 50% can be covered by one fixed setup in bottom-side
placement, and they account for 90% of all components placed on this line
(Böhnlein 1999). 

In bottom-side placement, the average changeover-related idle time per job
change was reduced by 60% from July 1998 to July 1999, and in top-side
placement by more than 40% over the same period. In three-shift operation,
this saved setup time amounting to several hours per day and placement line. 

During the observation period, the introduction of a new component series
increased the product spectrum manufactured per month by 100% in bottom-side
placement and by 50% in top-side placement. This required more changeovers,
which largely offset the savings mentioned above and allowed only a small
increase in nominal placement performance. Overall, however, the planned
procurement of an additional placement line could be postponed by two years
(Böhnlein 2002). 
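
The offsetting effect described above can be made plausible with a back-of-the-envelope calculation. The sketch assumes, purely for illustration, that the number of job changes grows in proportion to the monthly product spectrum; this proportionality is not stated in the text, and the function name is hypothetical.

```python
def relative_changeover_load(spectrum_growth: float,
                             idle_time_reduction: float) -> float:
    """Total changeover idle time relative to the initial situation, assuming
    the number of job changes scales with the product spectrum."""
    return (1.0 + spectrum_growth) * (1.0 - idle_time_reduction)

# Bottom side: +100% product spectrum, 60% less idle time per job change.
bottom_side = relative_changeover_load(1.00, 0.60)
# Top side: +50% product spectrum, 40% less idle time per job change
# (the text reports "more than 40%", so this is an upper bound on the load).
top_side = relative_changeover_load(0.50, 0.40)

print(f"bottom side: {bottom_side:.0%} of the original changeover idle time")
print(f"top side:    {top_side:.0%} of the original changeover idle time")
```

Under this assumption the total changeover idle time drops only to roughly 80 to 90% of its initial level, which is consistent with the small gain in nominal placement performance reported.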

Two years after the end of the cooperation project, the company still uses the
SIMOS setup concept productively every day. 



References 

Böhnlein C (1999) Potential und Einsatz von SIMOS-Festrüstungen in der
SMD-Bestückung. In: Schlecht M (Hrsg.) SMT Surface Mount Technology, EMP,
Böblingen, S 22-28 

Böhnlein C (2001) Verfahren zur Aufrüstung der Bestückungsautomaten einer
Bestückungslinie für ein Mix unterschiedlicher Leiterplattentypen. Deutsches
Patent- und Markenamt, Patent DE 198 34 620 vom 31.05.2001, München 

Böhnlein C (2002) Rüstzeitvermeidung in der Elektronikfertigung. DUV, Wiesbaden 

Feldmann K, Roth N (1991) Optimization of Set-up Strategies for Operating
Automated SMT Assembly Lines. Annals of the CIRP, Vol. 40/1/1991 








Feldmann K et al. (1992) Optimale Rüst- und Umrüststrategien steigern die
Produktivität. In: Leiterplattentechnik (Supplement zu F&M, MO, QZ), 05/1992.
Carl Hanser Verlag, LP46-LP50 

Günther H-O et al. (1996) A Heuristic for Component Switching on SMD Placement
Machines. Wirtschaftswissenschaftliche Dokumentation der TU Berlin,
Fachbereich 14. Berlin 

o. V. (2001) Effizientes SMD-Bestücken. Informationsabfrage vom 14.9.2001. In: 
http://www.mimot.de/produkte/wirtschaflich_konzept/wirtschaftlich/wirtschaftlich.htm 

Pawlischek H (1991) SMT-Bestückungstechniken. In: Herrmann G (Hrsg.) Handbuch
der Leiterplattentechnik, Neue Verfahren, Neue Technologien. Leuze, Saulgau,
S 191-228 

Rothhaupt A (1995) Modulares Planungssystem zur Optimierung der
Elektronikfertigung. Carl Hanser Verlag, München 

Schweitzer M (1994) Industriebetriebslehre - Das Wirtschaften in
Industrieunternehmungen. 2. Aufl., Vahlen, München 

ZVEI (2001) Elektroindustrie meldet zweistelliges Wachstum für 2000.
Zentralverband Elektrotechnik- und Elektronikindustrie e. V. In:
http://www.zvei.de/news/Presseinformationen/2000-12/Pr115-2000.htm 





Optimal Loading of Continuous Casting Plants
Using Two-Dimensional Bin-Packing Models



Thomas Spengler / Oliver Seefried¹ 

Technische Universität Braunschweig, Institut für Wirtschaftswissenschaften, 
Abteilung BWL, insb. Produktionswirtschaft, 

Katharinenstr. 3, 38106 Braunschweig, 

Tel.: 0531/391-2201, Fax: 0531/391-2203, 
e-mail: t.spengler@tu-bs.de, o.seefried@tu-bs.de 



Abstract. In integrated steelworks, the continuous casting production stage
casts crude steel, which was produced in converter metallurgy in deterministic
ladle quantities, into continuous strands. In this paper, the problem of
loading a continuous casting plant is modeled as a two-dimensional bin-packing
problem with variable strand length and variable strand width. The objective
to be minimized is the casting time as a measure of plant performance (and
thus, because of the constant casting speed, the strand length). A solution
method is presented that, implemented with commercial OR software on a
standard PC, is able to generate good solutions for problem sizes of practical
relevance within minutes. 



1 Introduction and Problem Description 

A key production step in the manufacture of crude steel within an integrated
steelworks process is continuous casting, in which liquid crude steel, adjusted
to the required quality properties in a refining process in converter
metallurgy, is transformed into the solid state needed for further processing
on various rolling mill stages. Detailed process descriptions of continuous
casting can be found in, among others, Tanner (1997) and V.d.E.H. (1999). 

After the casting process is completed, the solidified steel strand is first
divided crosswise by a guillotine cut into sections of the full strand width.
As a rule, a second cutting operation then divides along the flow direction in
order to split the previously separated sections into slabs of different
widths. The production performance of a continuous casting plant can be
measured by the time needed to cast a given assortment of production orders.²
Since the casting speed, as one of the factors influencing this time, is
generally limited by the analysis specifications of the grade to be produced,
minimizing the required strand length can be regarded as synonymous with
maximizing plant performance. Because of the deterministic heat sizes in
converter metallurgy and the strand height that is constant over the entire
casting process, the cast strand length is uniquely described by the two
action parameters strand width and number of heats required. 

¹ The authors thank the participating staff of Salzgitter Flachstahl GmbH for
their support in carrying out the underlying project work. 

While the maximum casting width (b_max) of the continuous casting plant
technically bounds the strand width to be determined from above, a lower bound
results from the cutting patterns with which the strand is divided into the
individual slabs. Depending on how the slabs are distributed over the sections
of the sequence, different section widths result, and their maximum value
gives the order-related width (b_a) of the sequence. 

With heterogeneous slab dimensions, two types of trim loss generally occur: 

• intermediate trim loss (ZV), which arises within the packing of the slabs, 

• "heat trim loss", which results from the fact that the discrete heat sizes in
converter metallurgy do not match the sum of the slab weights, and which
occurs either as length trim loss (LV) or as width trim loss (BV). 

To minimize the cast strand length and thus the required casting time, it makes
sense to cast the individual sequences as wide as possible and to convert any
length trim loss into width trim loss. This strategy, however, carries the risk
of needing additional heats to provide the set of slabs. This conflict of
objectives and the interdependencies among the decision variables strand
length, strand width, and number n of heats cast are illustrated in Figure 1: 




Fig. 1. Interdependencies between slab positioning and sequence parameters 

The figure shows that casting at maximum width need not automatically reduce
the sequence length. Starting from a slab positioning that minimizes the
order-related strand length l_a, the choice of sequence parameters is
restricted to the white area to the upper right of point P in the diagram. It
becomes clear that for n = 1 heat there is no width-length combination with
which the required set of slabs could be cast. For n = 2 and n = 3 heats, by
contrast, feasible combinations exist, of which three boundary points are
highlighted as examples: 

² Idle times due to setup operations, which can be caused by differences in
width or quality between consecutively cast sequences (Spengler/Seefried/Kock
(2001)), also affect plant performance. They are, however, not considered
further in the following discussion. 

(1) If casting is done at the minimum possible, order-related width (b_a),
two heats result in a length surplus. 

(2) A slight increase in the strand width reduces the strand length to l_s2,
whereby the length surplus is eliminated and turns into a width surplus. 

(3) Widening the sequence up to its upper bound increases the casting time,
because the function for n = 2 heats becomes infeasible. 
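
The relationship between strand width, strand length and number of heats can be sketched numerically. The code below is an illustration under stated assumptions and not the authors' implementation: each heat is represented by a constant area (width times length), a slab positioning is summarized by its order-related length l_a and width b_a, and all numerical values are hypothetical.

```python
def sequence_parameters(n_heats, heat_area, l_a, b_a, b_max):
    """Return (width, length) of the shortest feasible strand for n_heats,
    or None if no feasible width-length combination exists."""
    # Constant area: width * length == n_heats * heat_area.
    total_area = n_heats * heat_area
    # The strand must be at least l_a long, so its width is at most
    # total_area / l_a; it is also capped by the maximum casting width.
    width = min(b_max, total_area / l_a)
    if width < b_a:          # narrower than the slab positioning requires
        return None
    return width, total_area / width

# Hypothetical data in mm and mm^2.
heat_area, l_a, b_a, b_max = 20e6, 18000.0, 2000.0, 2600.0
for n in (1, 2, 3):
    print(n, sequence_parameters(n, heat_area, l_a, b_a, b_max))
```

With these numbers one heat is infeasible, two heats allow the length surplus at width b_a (20000 mm) to be converted into a width surplus at about 2222 mm, and casting at the full 2600 mm requires a third heat and a longer strand, which mirrors the three boundary points described above.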

If, for the planning problem under consideration, a differentiated economic
valuation of the various types of trim loss is dispensed with and the strand
length is treated as a surrogate objective to be minimized, the problem can be
classified in the typology of Dyckhoff/Finke (1992) as a two-dimensional
bin-packing problem in which the additional, technically induced restrictions
of deterministic heat sizes lead to "containers" with variable side length and
width but constant base area. Comprehensive surveys of research on trim loss
minimization can be found in, among others, Dyckhoff/Finke (1992) and
Dyckhoff/Scheithauer/Terno (1995). 

To solve it, an optimization model is formulated below that can answer the
following questions: 

• At which position are the slabs located within the sequence? 

• How many heats are needed to cast a sequence? 

• How wide must the strand be set on the plant? 



2 Model Formulation and Solution Approach 

The formulation of the decision model is based on the following assumptions: 

• All slabs of a grade that are to be produced in the planning period are
manufactured in only one sequence. 

• The strand width with which the steel is cast is not varied within this
sequence. 

• The weight of all heats of a grade is identical and, because of the constant
strand height and crude steel density, can be converted into an area. 

• Trim loss quantities are not valued differently. 

• In arranging the slabs, at most two slabs can lie side by side in a
section.³ 



³ This restrictive assumption is due to the fact that, at the reference company
Salzgitter Flachstahl GmbH, slabs are further processed only in widths between
1000 and 2600 mm because of the geometries of the downstream rolling mill
stages (Preussag Stahl (Hrsg.) (1993)). With a maximum casting width of the
plant under study 








Indices: 

i, p = 1, ..., m    slabs 

Variables: 

x_{ip} = 1 if slab i is placed next to a slab p that is not longer, 0 otherwise 

Data: 

l_i: lengths of the slabs [mm] 
b_i: widths of the slabs [mm] 
b_max: maximum casting width of the continuous casting plant [mm] 

\min \sum_{i=1}^{m} \sum_{p=1}^{m} x_{ip} \, l_i                                  (1) 

subject to 

\sum_{p=1}^{m} x_{ip} + \sum_{p=1}^{m} x_{pi} = 1    for i = 1, ..., m            (2) 

x_{ip} \, l_i \ge x_{ip} \, l_p    for i = 1, ..., m; p = 1, ..., m               (3) 

x_{ip} (b_i + b_p) \le b_{\max}    for i = 1, ..., m; p = 1, ..., m               (4) 

x_{ip} \in \{0, 1\}    for i = 1, ..., m; p = 1, ..., m                           (5) 



For each grade/sequence, depending on the number m of slabs it contains, this
yields a linear binary optimization problem with m² decision variables and
2m²+m (linear) constraints. Implemented with the OR software package LINGO 6.0
on a PC (Pentium III processor with a clock frequency of 600 MHz), it can be
solved within seconds or minutes for sizes up to m = 80. 
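
For very small instances, the pairing problem described by the model can be solved exactly by enumeration, which is useful for checking a solver formulation. The following sketch is not the LINGO model; it assumes that a section holds one slab or two slabs side by side, that a section is as long as its longer slab, and that a single slab corresponds to a pairing with a zero-size dummy slab. All data are hypothetical.

```python
from functools import lru_cache

def min_order_length(slabs, b_max):
    """Exact minimum of the summed section lengths for a small instance.
    slabs: list of (length_mm, width_mm)."""
    @lru_cache(maxsize=None)
    def best(remaining):
        if not remaining:
            return 0
        first, rest = remaining[0], remaining[1:]
        # Option 1: slab `first` forms a section on its own.
        result = slabs[first][0] + best(rest)
        # Option 2: pair `first` with any partner that fits in the width.
        for k, partner in enumerate(rest):
            if slabs[first][1] + slabs[partner][1] <= b_max:
                length = max(slabs[first][0], slabs[partner][0])
                result = min(result, length + best(rest[:k] + rest[k + 1:]))
        return result

    return best(tuple(range(len(slabs))))

def model_size(m):
    """Number of binary variables and constraints for m slabs."""
    return m * m, 2 * m * m + m

slabs = [(12000, 1500), (11000, 1000), (10000, 1600), (9000, 1000)]
print(min_order_length(slabs, 2650))
print(model_size(80))
```

The enumeration grows quickly with the number of slabs, which is why a solver or a heuristic is needed for realistic sequence sizes; for m = 80 the binary model already has 6400 variables and 12880 constraints.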

Whether minimizing the order-related length l_a also solves the actual planning
problem, namely minimizing the sequence length l_s, is illustrated by
Figure 2. 




Fig. 2. Optimality analysis of a heuristic solution approach 

Starting from the technically maximum casting width, the optimization model
(1) - (5) is used to determine a slab positioning with minimal order-related
length (point 1). If this slab arrangement achieves the minimum number of
heats⁴ n_min, an optimal solution of the planning problem has been found.
Otherwise, by reducing the maximum strand width, the feasible solution space
to the lower right of the most recently found starting point can be restricted
step by step. If, in the course of this iterative procedure, a slab
arrangement below the isoquant for n_min heats is found (point 2), the optimal
solution is obtained by a vertical transformation onto the corresponding
isoquant. If this is not possible, the optimal solution lies at point 3. 

of 2650 mm, arranging more than two slabs side by side is therefore ruled out. 
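
The minimum number of heats n_min used in this procedure is, according to the accompanying footnote, the rounded-up quotient of the summed slab weights and the average heat quantity. A minimal sketch of that calculation follows; the function name, the units (tonnes) and the sample weights are assumptions for illustration.

```python
import math

def min_heats(slab_weights_t, mean_heat_t):
    """Theoretical minimum number of heats: total slab weight divided by
    the average heat quantity, rounded up to the next integer."""
    return math.ceil(sum(slab_weights_t) / mean_heat_t)

# Hypothetical slab weights and heat size in tonnes.
slab_weights = [25.0, 30.5, 28.0, 22.5, 27.0]
print(min_heats(slab_weights, 60.0))
```

If a slab positioning can be cast with exactly this number of heats, no further search is needed, since no arrangement can use fewer.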

Building on these considerations, a heuristic solution approach is implemented
through the interplay of an Excel/VBA application with LINGO. If the procedure
terminates during the minimization of the order-related sequence length
because the computing time limit⁵ is exceeded or memory runs out, without
having generated a solution, a simple priority rule is used instead: choose
the longest slab still to be scheduled and assign to it, as its partner slab,
the longest slab in the list whose width does not exceed the remaining width
of the section. 
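
The priority rule can be written down in a few lines. This is a sketch of the rule as stated above, not the authors' Excel/VBA code; the function names and the sample slabs are hypothetical.

```python
def priority_rule(slabs, b_max):
    """Greedy pairing. slabs: list of (length_mm, width_mm).
    Returns a list of sections (lead_slab, partner_or_None)."""
    todo = sorted(slabs, key=lambda s: s[0], reverse=True)  # longest first
    sections = []
    while todo:
        lead = todo.pop(0)                  # longest slab still unscheduled
        rest_width = b_max - lead[1]        # width left in this section
        # Longest remaining slab that still fits next to the lead slab.
        partner = next((s for s in todo if s[1] <= rest_width), None)
        if partner is not None:
            todo.remove(partner)
        sections.append((lead, partner))
    return sections

def order_length(sections):
    """Sum of section lengths; the lead slab is never shorter than its partner."""
    return sum(lead[0] for lead, _ in sections)

slabs = [(12000, 1500), (11000, 1000), (10000, 1600), (9000, 1000)]
sections = priority_rule(slabs, 2650)
print(sections, order_length(sections))
```

Because the slabs are processed in order of decreasing length, the lead slab always determines the section length, so the rule directly yields a feasible value for the order-related length.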



3 Case Study 

To validate the optimization model developed in Section 2, it is applied to a
test data set from Salzgitter Flachstahl GmbH containing 3243 slabs (59
grades) with target widths between 1000 and 2580 mm and lengths from 9 to
12 m⁶. The number of slabs per steel grade in the data set varies between 8
and 714. In total, the test data set contains about 1200 grade-dimension
combinations, which indicates a strongly heterogeneous order spectrum. 

In the optimization runs, optimal solutions are generated for 51 of the 59
sequences, and for 11 sequences a reduction by one heat is achieved in the
course of the procedure. The total computing time is about 8 hours. Since the
3243 slabs examined correspond to a two-week production program, the procedure
can be regarded as uncritical with respect to time in real use. Across all
sequences, the trim loss amounts to 15.1%, of which 5.3% is intermediate trim
loss within the packing of the slabs. Since the larger part occurs as length
or width trim loss, it seems likely that better results could be achieved if
the assumption that all slabs must be included in a single sequence is
dropped. 



^ The theoretical minimum number of heats is obtained by rounding up the quotient of the summed slab weights and the mean heat size.

^ This limit is set to 180 seconds in the test runs.

^ To be able to assign two slabs to each section, dummy slabs with a length and width of 0 mm are added to the test data set for all steel grades, so that a total of 3504 records are considered.








4 Outlook and Further Steps

This contribution develops a heuristic solution procedure with which good production programs can be planned for an existing reference facility.

In future work it remains to be examined how the underlying (in part quite restrictive) assumptions affect solution quality. Besides the possibility already mentioned of leaving individual slabs unconsidered in the planning period, the approach should in particular be generalized to the case in which, owing to different plant configurations, more than two slabs can be cast in parallel.

Furthermore, it should be investigated how integrating downtimes caused by width-related setup operations at sequence transitions feeds back on the results obtained so far.



References

Dyckhoff, Finke (1992): "Cutting and Packing in Production and Distribution", Heidelberg.

Dyckhoff, Scheithauer, Terno (1997): "Cutting and Packing", in: Dell'Amico, Maffioli, Martello, Annotated Bibliographies in Combinatorial Optimization, pp. 393-413, Chichester.

PREUSSAG STAHL AG (ed.) (1993): "Werkstoff Stahl - Herstellung und Verarbeitung", Salzgitter.

Spengler, Seefried, Kock (2001): "Planungsmodell zur kostenminimalen Brammenversorgung eines integrierten Hüttenwerks", in: Operations Research Proceedings 2000, pp. 345-350.

Tanner (1997): "Revolution in der Stahlindustrie: Strangguss", Zürich.

VEREIN DEUTSCHER EISENHÜTTENLEUTE (ed.) (1999): "Stahlfibel", Düsseldorf.





Short-term Capacity Planning in Manufacturing 
Companies with a Decentralized Organization 

Peter Letmathe, Institut für Umwelt- und Technologiemanagement,
Universität Bayreuth,
D-95440 Bayreuth

Abstract: Bottleneck stations have a production rate lower than the demand rate for the parts they produce. Therefore they limit the production system’s output rate. The purpose of short-term capacity planning is to reduce the bottlenecks’ negative impact on the output rate. This paper discusses several opportunities to adjust capacity to the system’s requirements in a decentralized production organization. Since these opportunities incur additional costs, it is crucial to find a cost-efficient solution to this problem. For this purpose, a linear model is proposed which can be used to determine the optimal mix of short-term capacity adjustment measures. The model can be recalculated easily to take the actual output and capacity requirements into account if any deviations from the original plan occur.



1. Starting point

This paper considers a decentralized job shop production divided into several organizational units (production stations). Each station has a given workload of different tasks. Every task has to be performed within a defined time window which comprises the task’s processing time, including the average setup time per unit, and a time buffer. The size of the buffers determines the station’s scope for changing the order of the tasks, assigning them to different machines and scheduling them at different times. Since the master production schedule (MPS) might lead to a workload above the stations’ standard capacity, the production stations are sometimes confronted with bottleneck situations that make it necessary to exceed standard capacity. For this purpose, the production stations can use several opportunities of short-term capacity adjustment combined with the linear planning model described in the following sections (see Letmathe, 2002).



2. Opportunities of short-term capacity adjustment 

In case of bottleneck situations, production stations have several opportunities to adjust their capacity to actual demand (Gutenberg, 1983):

• The station may perform tasks in advance, which means that parts are not produced in the same period they are needed.

• The station may work overtime, which means that actual working hours are higher than standard working hours per week.

• The production rate of one or more machines may be increased to perform more tasks than production at the cost-optimal production rate allows.

• The number of machine setups may be reduced. The saved time is capacity which can be used to produce more parts.






• The station may solve the bottleneck problem in concert with other production stations. The mutual coordination of production stations may help relax temporary bottleneck situations by extending time buffers of bottleneck stations.

• Some of the tasks may be performed by external suppliers, which leads to a reduced demand for a bottleneck station’s capacity and therefore loosens the bottleneck’s confining impact.



3. Linear model 



All these measures to adjust capacity incur additional costs which should be minimized by the production stations. Since this is a problem of high complexity, the stations need planning methods to find the cost-optimal solution. In many cases, an optimal solution calls for using several measures of capacity adjustment simultaneously.



The goal function calculates direct capacity adjustment costs by multiplying the direct adjustment cost per capacity unit of each measure by the total additional capacity gained through this measure. The sum of the capacity adjustment costs of all measures gives the total capacity adjustment costs during the planning horizon:

$$K = \sum_{n=1}^{N}\sum_{\tau=1}^{T}\left[\, c^{A}_{n}\, R^{A}_{n\tau} + \sum_{m=1}^{M_n} c^{Im}_{n}\, R^{Im}_{n\tau} + c^{S}_{n}\, R^{S}_{n\tau} \right] + c^{O}\sum_{\tau=1}^{T} R^{O}_{\tau} \;\to\; \min!$$



with:

$K$               Total capacity adjustment costs
$c^{A}_{n}$       Cost per additional capacity unit gained through performing task n in advance by one period
$R^{A}_{n\tau}$   Additional capacity through production in advance, weighted with t - τ periods, if task n is produced in period τ
$c^{Im}_{n}$      Cost per additional capacity unit if task n is produced with production rate m*
$R^{Im}_{n\tau}$  Additional capacity through performing task n with production rate m in period τ
$c^{S}_{n}$       Cost per additional capacity unit in case of external performance of task n
$R^{S}_{n\tau}$   Virtual capacity gain through external performance of task n in period τ
$c^{O}$           Costs per time unit above standard working time
$R^{O}_{\tau}$    Additional capacity through overtime in period τ
$\tau$            Period of production
$t$               Period of demand
$T$               Planning horizon

* m is a production rate index and does not reflect the production rate itself
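As a small illustration of how the goal function aggregates the four cost terms, the following sketch evaluates K for given capacity gains. The function name, the argument names and the nested-list data layout are assumptions made for this example and are not part of the model itself; the numbers are invented.

```python
def total_adjustment_cost(c_adv, r_adv, c_rate, r_rate, c_ext, r_ext,
                          c_overtime, r_overtime):
    """Evaluate the total capacity adjustment costs K.

    c_adv[n]          cost per capacity unit for production in advance of task n
    r_adv[n][tau]     weighted capacity gain through production in advance
    c_rate[n][m]      cost per capacity unit at production rate index m
    r_rate[n][tau][m] capacity gain through production rate index m
    c_ext[n]          cost per capacity unit for external performance
    r_ext[n][tau]     virtual capacity gain through external performance
    c_overtime        cost per overtime unit
    r_overtime[tau]   overtime capacity in period tau
    """
    total = 0.0
    for n in range(len(c_adv)):
        for tau in range(len(r_overtime)):
            total += c_adv[n] * r_adv[n][tau]
            total += sum(c * r for c, r in zip(c_rate[n], r_rate[n][tau]))
            total += c_ext[n] * r_ext[n][tau]
    # Overtime is not task-specific, so it is added once per period.
    total += c_overtime * sum(r_overtime)
    return total


# One task, two periods, one alternative production rate (illustrative numbers).
K = total_adjustment_cost(
    c_adv=[2.0], r_adv=[[3.0, 0.0]],
    c_rate=[[1.5]], r_rate=[[[2.0], [4.0]]],
    c_ext=[5.0], r_ext=[[0.0, 1.0]],
    c_overtime=4.0, r_overtime=[1.0, 2.0],
)
print(K)  # 32.0
```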

This goal function includes only capacity adjustment measures which cause direct 
costs. Other measures like the reduction of the number of machine setups, shifting 
of time windows and informal agreements with other stations are considered in the 
model’s constraints. E.g. the reduction of machine setups increases the capacity 
available to perform tasks in the affected periods. Coordination with other stations 
is reflected through shifted time windows for one or several tasks. This means that 
team members of production stations will make decisions in advance which are 
exogenous to the model. The result is a linear model without integer variables 
which can be solved easily. To find the optimal solution several constraints have 
to be taken into account: 

1. Satisfaction of current demand 

All tasks to be performed can be derived from the plant’s master production schedule. To ensure that tasks assigned to the station are performed, the demand $x_{nt}$ for task n in t has to be less than or equal to the available amount of task n in t. The available amount of task n is calculated as the externally performed quantity $y^{S}_{nt}$ of task n in t plus all quantities $y^{I}_{n\tau t}$ produced internally in the periods τ = 1,…,t to serve demand in period t. This constraint has to be fulfilled in all periods t = 1,…,T and for all tasks n = 1,…,N:

$$\sum_{\tau=1}^{t} y^{I}_{n\tau t} + y^{S}_{nt} \;\ge\; x_{nt} \qquad n = 1,\dots,N,\quad t = 1,\dots,T$$



2. Calculation of production volume 

The output quantity $y^{I}_{n\tau t}$ of task n produced in period τ to meet the demand in period t results from the sum of the performed tasks over all possible production rates (m = 1,…,M_n) with which task n can be produced:

$$y^{I}_{n\tau t} = \sum_{m=1}^{M_n} y^{Im}_{n\tau t} \qquad n = 1,\dots,N,\quad \tau = 1,\dots,T,\quad t = 1,\dots,T,\quad \tau \le t$$

The total amount $y^{I}_{n\tau}$ of task n performed in period τ is calculated as the sum of $y^{I}_{n\tau t}$ over all periods t = τ,…,T:

$$y^{I}_{n\tau} = \sum_{t=\tau}^{T} y^{I}_{n\tau t} \qquad n = 1,\dots,N,\quad \tau = 1,\dots,T$$



3. Capacity constraints 

The model has to ensure that the available capacity of every period is sufficient to perform all tasks assigned to the bottleneck station. The available capacity results from the sum of the period’s standard capacity $C_{\tau}$ plus the capacity gained through overtime $R^{O}_{\tau}$ and adjustment of production rates $R^{I}_{\tau}$. The required capacity is calculated as the number of machine setups $r_{n\tau}$ in period τ multiplied by the time $t^{R}_{n}$ to set up a machine to perform task n, plus the quantity $y^{I}_{n\tau}$ of task n in period τ times the capacity coefficient $a_n$ of task n. Note that the capacity coefficient reflects the standard processing time to perform task n. If the production rate is above the standard production rate, the additional capacity is included in $R^{I}_{\tau}$ and therefore does not need to be taken into account through different capacity coefficients for one task. These considerations lead to the following constraints:

$$C_{\tau} + R^{O}_{\tau} + R^{I}_{\tau} \;\ge\; \sum_{n=1}^{N} a_n\, y^{I}_{n\tau} + \sum_{n=1}^{N} r_{n\tau}\, t^{R}_{n} \qquad \tau = 1,\dots,T$$

4. Limitation of overtime 

Overtime $R^{O}_{\tau}$ in period τ is restricted to an upper limit $\bar{R}^{O}_{\tau}$ which may be based on an agreement with the company’s workers’ council:

$$R^{O}_{\tau} \le \bar{R}^{O}_{\tau} \qquad \tau = 1,\dots,T$$



5. Capacity adjustment through variation of production rates 
The additional capacity $R^{Im}_{n\tau}$ gained through performing task n at production rate m in period τ reflects the quantity $y^{Im}_{n\tau t}$ times the difference between the capacity required for performing the task in standard processing time $a_n$ and the capacity required for performing it in reduced processing time $a^{m}_{n}$. This term is summed over all periods from τ to the end of the planning horizon T:

$$R^{Im}_{n\tau} = \sum_{t=\tau}^{T} \left(a_n - a^{m}_{n}\right) y^{Im}_{n\tau t} \qquad n = 1,\dots,N,\quad \tau = 1,\dots,T,\quad m = 1,\dots,M_n$$

The total additional capacity $R^{I}_{\tau}$ in period τ is then calculated as the sum of all $R^{Im}_{n\tau}$ over all production rates m and all tasks n:

$$R^{I}_{\tau} = \sum_{n=1}^{N}\sum_{m=1}^{M_n} R^{Im}_{n\tau} \qquad \tau = 1,\dots,T$$
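The capacity gain from faster production rates can be illustrated numerically. The sketch below assumes a simple data layout (a list of tasks, each with its standard processing time and a dictionary of alternative rates); the function names and the numbers are hypothetical.

```python
def rate_gain(a_std, a_rate, quantities):
    """Capacity gained for one task, one rate index and one production period.

    a_std      standard processing time per unit (capacity coefficient)
    a_rate     reduced processing time per unit at the faster rate
    quantities amounts produced in this period for the demand periods t = tau..T
    """
    return sum((a_std - a_rate) * q for q in quantities)


def total_rate_gain(tasks):
    """Sum the gains over all tasks and all rate indices of one period.

    tasks: list of (a_std, {rate_index: (a_rate, quantities)})
    """
    return sum(
        rate_gain(a_std, a_rate, quantities)
        for a_std, rates in tasks
        for a_rate, quantities in rates.values()
    )


# Two tasks in one production period (illustrative numbers, time in hours).
tasks = [
    (2.0, {1: (1.5, [10, 4])}),                 # saves 0.5 h on each of 14 units
    (3.0, {1: (2.0, [5]), 2: (2.5, [2])}),      # saves 1.0 h * 5 + 0.5 h * 2
]
print(total_rate_gain(tasks))  # 13.0
```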



6. Capacity adjustment through production in advance 

Capacity adjustment through production in advance does not yield additional capacity. Advanced production means using capacity which is not required in earlier periods. Hence, production in advance is not included in the calculation of additional capacity. Production in advance is useful if a production station is confronted with temporary bottleneck situations which can be resolved by shifting production to earlier periods. Production in advance causes holding costs for the parts which have to be kept in stock between the production period τ and the period t when the parts are required. In order to calculate the costs of production in advance in the goal function correctly, the performed tasks $y^{I}_{n\tau t}$ have to be multiplied by the time difference of production in advance t - τ and the capacity coefficient $a_n$ of task n. The sum over all periods t = τ + 1,…,T leads to the virtual capacity gain in period τ through performance of task n weighted with t - τ:

$$R^{A}_{n\tau} = \sum_{t=\tau+1}^{T} (t - \tau)\, a_n\, y^{I}_{n\tau t} \qquad n = 1,\dots,N,\quad \tau = 1,\dots,T$$

In the goal function, this weighted virtual capacity gain is multiplied by the holding cost per capacity unit of task n.
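A short numerical sketch of the weighting by t - τ follows. The function name, the dictionary layout (demand period mapped to quantity) and the numbers are assumptions made for illustration only.

```python
def advance_capacity_gain(a_n, tau, qty_by_demand_period):
    """Weighted virtual capacity gain of one task produced in period tau.

    a_n                   capacity coefficient (standard processing time per unit)
    tau                   production period
    qty_by_demand_period  maps demand period t to the quantity produced in tau for t
    """
    # Only quantities produced for later periods (t > tau) are held in stock.
    return sum((t - tau) * a_n * qty
               for t, qty in qty_by_demand_period.items() if t > tau)


# Task with a_n = 2.0 produced in period 1: 5 units for period 1,
# 3 units for period 2 and 1 unit for period 4 (illustrative numbers).
gain = advance_capacity_gain(2.0, 1, {1: 5, 2: 3, 4: 1})
holding_cost = 0.5 * gain   # assumed holding cost of 0.5 per capacity unit and period
print(gain, holding_cost)   # 12.0 6.0
```

The 5 units produced for the same period do not contribute, since they are never kept in stock.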

7. Capacity adjustment through external task performance 

If a station does not perform all tasks by itself, it may be allowed to place orders with external suppliers. This leads to a further relaxation of the station’s bottleneck situation. The plant management or contracts with external suppliers may restrict external task performance $y^{S}_{nt}$ to an upper limit $\bar{y}^{S}_{nt}$ of task n in t. The required quantity of external parts is calculated as the difference between the demand $x_{nt}$ of task n in t and the number of parts $y^{I}_{n\tau t}$ of task n produced in τ = 1,…,t for period t. This difference has to be less than or equal to the upper limit $\bar{y}^{S}_{nt}$:

$$y^{S}_{nt} = x_{nt} - \sum_{\tau=1}^{t} y^{I}_{n\tau t} \;\le\; \bar{y}^{S}_{nt} \qquad n = 1,\dots,N,\quad t = 1,\dots,T$$

The capacity relaxation $R^{S}_{n\tau}$ through external task performance of task n in t is measured as the quantity of externally purchased parts $y^{S}_{nt}$ multiplied by the capacity coefficient $a_n$ in case of internal performance at optimal production rate:

$$R^{S}_{n\tau} = a_n\, y^{S}_{nt} \qquad n = 1,\dots,N,\quad \tau = 1,\dots,T \quad \text{with } \tau = t$$



8. Production within the given time windows

All tasks have to be performed within the given time windows $TW_n$. Therefore, it is necessary to calculate the length of the time windows first. Here it is assumed that the length of the time windows depends on the processing time of each task. This means that the minimum length of a time window $a_n \cdot TB_n$ is calculated as a given percentage $TB_n$ of the processing time of task n. Since this model measures time in discrete units, the earliest start of performing a task is $t - [a_n \cdot TB_n] - 1$, where the brackets stand for the Gauss brackets. Production planning has to ensure that all preceding tasks are finished before this starting time. The following restrictions guarantee that $y^{I}_{n\tau t}$ is zero if τ lies before the earliest starting time and is non-negative otherwise:

$$y^{I}_{n\tau t} = 0 \quad \text{if } \tau < t - TW_n = t - [a_n \cdot TB_n] - 1$$

$$y^{I}_{n\tau t} \ge 0 \quad \text{if } \tau \ge t - TW_n = t - [a_n \cdot TB_n] - 1$$

$$n = 1,\dots,N,\quad \tau = 1,\dots,T,\quad t = 1,\dots,T$$
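The time-window rule can be made concrete with a small sketch. It assumes that the Gauss bracket denotes the floor function, that $a_n$ is expressed in planning periods and that $TB_n$ is given as a factor (for example 1.5 for 150%); the function names and the upper bound τ ≤ t are likewise assumptions of this example.

```python
import math


def time_window(a_n, tb_n):
    """Length TW_n of the time window in periods: floor(a_n * TB_n) + 1."""
    return math.floor(a_n * tb_n) + 1


def earliest_start(t, a_n, tb_n):
    """Earliest production period for demand in period t."""
    return t - time_window(a_n, tb_n)


def production_allowed(tau, t, a_n, tb_n):
    """True if producing in period tau for demand period t respects the window."""
    return earliest_start(t, a_n, tb_n) <= tau <= t


# Task with a_n = 2.0 periods and TB_n = 1.5, demand in period 10:
# floor(3.0) + 1 = 4, so production may start in period 6 at the earliest.
print(time_window(2.0, 1.5), earliest_start(10, 2.0, 1.5))
print([tau for tau in range(1, 13) if production_allowed(tau, 10, 2.0, 1.5)])
```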



9. Non-negativity constraints

Finally, the non-negativity constraints guarantee that the amounts of the performed tasks and the overtime working hours are non-negative:

$$y^{Im}_{n\tau t} \ge 0 \qquad n = 1,\dots,N,\quad \tau = 1,\dots,T,\quad t = 1,\dots,T,\quad m = 1,\dots,M_n$$

$$y^{S}_{nt} \ge 0 \qquad n = 1,\dots,N,\quad t = 1,\dots,T$$

$$R^{O}_{\tau} \ge 0 \qquad \tau = 1,\dots,T$$

The non-negativity of all other decision variables ($R^{A}_{n\tau}$, $R^{Im}_{n\tau}$, $R^{I}_{\tau}$, $R^{S}_{n\tau}$, $y^{I}_{n\tau t}$) follows automatically from the restrictions above.



4. Conclusions 

The presented model allows bottleneck stations to quickly calculate an optimal capacity adjustment strategy. If it is implemented in a user-friendly way, it offers the opportunity to adjust capacity to short-term requirements cost-efficiently. It is then suitable to support a decentralized manufacturing organization. Decisions requiring integer variables, like the number of machine setups and the order sequencing, are made by the station’s workforce and are therefore exogenous to the model. Since no integer variables are included, the model can be recalculated easily according to production progress. All in all, the model enhances the company’s flexibility to cope with bottleneck situations without compromising the advantages of decentralized organizations.



References 

Gutenberg, E.: Grundlagen der Betriebswirtschaftslehre, Bd. 1, Die Produktion, Springer, Berlin / Heidelberg / New York, 24. Aufl. 1983.

Letmathe, P.: Flexible Standardisierung - Ein dezentrales Produktionsmanagement-Konzept für kleine und mittlere Unternehmen, Gabler, Wiesbaden 2002.





Performance Analysis of Make to Stock Supply 
Chains Using Discrete-Time Queueing Models 



Sandeep Jain and N. R. Srinivasa Raghavan 

Department of Management Studies, Indian Institute of Science, Bangalore, 
560012, INDIA 

email: {sandeep, raghavan}@mgmt.iisc.ernet.in



Abstract. Manufacturing supply chains are formed out of complex interconnections among several vendors, manufacturing facilities, warehouses, retailers and logistics providers. In this paper, we concern ourselves with a Supply Chain Network (SCN) consisting of a single manufacturing plant, one warehouse and several retail outlets. We present a discrete-time queueing model which can be used for evaluating the performance of a given SCN which processes customer orders at discrete time intervals. It can also be used for determining the optimal inventory level at the warehouse that minimizes the total cost of carrying inventory and the back order cost associated with serving orders in the backlog queue, subject to a service level constraint in terms of the expected waiting time for customers. The model is analyzed by using discrete-time queues. We use CONWIP as the inventory control model.



1 Introduction 

Supply Chain Management (SCM) focuses on the co-ordination among enterprises which work on procuring, producing and delivering products and services to customers who are at different geographical locations. Performance analysis of SCNs is very complex and difficult. In this paper, we present a new methodology to evaluate the performance of a 3-stage SCN. This model can be used to formulate and solve optimization problems in the area of inventory control. We briefly survey the literature on mathematical models for inventory control in supply chains. For an overview of various inventory models, the reader may refer to [9]. Kim and Tang [4] highlight the trade-off between manufacturing lead time and response time. Bertsimas and Paschalidis [5] develop a model for devising the optimal production policy to minimize the inventory cost. Buzacott [8] is the first researcher to propose queueing models for production systems, including kanban based systems. Bruneel [7] elaborates on the general discrete-time queue (DTQ) for the single server and infinite waiting room case. Supply chains can be viewed as discrete-event dynamic systems (DEDS) [6]: order arrivals, goods arrivals and departures represent the discrete events, themselves being dynamic. This paper is organized as follows: In section 2, we present the notation and the model. Some illustrations of performance analysis follow in section 3. We then present the optimization problem and results in section 4.







2 Notations and Model 

2.1 Notations 

Q : Number of units in one bucket
K : Total number of buckets at warehouse
λ : Demand arrival rate at warehouse
I(t) : Number of finished goods inventory at warehouse
N(t) : Number of orders in the factory being processed
B(t) : Number of back orders in the system at time t
h : Inventory holding cost ($ per unit)
b : Back order cost ($ per unit)
p : Probability that processing of a batch of products will finish in Q slots of time at the manufacturing plant
T : Waiting time for the orders placed by the retailers
M_0(t) : Probability that zero external arrivals occur within a time interval of duration t where the time interval is placed at random in time
M_n(t) : Probability that n arrivals occur within a time interval of duration t where the time interval is placed at random in time
A(z) : Probability generating function (pgf) of arrival distribution
S(z) : Pgf of service distribution

In this paper we analyze a 3-stage supply chain which consists of one manufacturing plant, one warehouse, and n identical retailers (see Fig. 1). We model the warehouse inventory as the input control mechanism for the manufacturing plant, which itself is modeled as a single stage discrete time queueing system. The orders arrive at the warehouse from various retailers as a Poisson process with a constant rate λ. The warehouse keeps finished goods inventory in K buckets, each of which holds exactly Q units. Arriving orders from retailers deplete the on-hand inventory at the warehouse, if any. Otherwise (in a stock-out situation) the arriving orders have to wait to be fulfilled. We refer to this situation as “back order”. In our model, we assume that the number of back orders can be infinite. The warehouse places an order with the manufacturing plant when one bucket (Q units) is depleted. Thus, when the retail orders are Poisson, the inter-arrival time of the orders at the manufacturing plant is Erlang distributed with Q phases and rate λ/Q. We assume that the manufacturing plant has infinite waiting line capacity. At the manufacturing plant, orders arrive in batches of Q units. The manufacturing plant processes orders at fixed discrete time slots. The time slots are decided appropriately. For instance, slots could be a couple of hours, one shift, a day and so on. In our model, the processing time of orders includes setup, manufacturing and logistics time. Orders arriving in between slots have to wait for the next slot to be processed. The orders are processed based on the first-come-first-serve policy. We assume the processing time is negative binomially distributed with parameters p and Q. Our work represents the first attempt in terms of applying discrete time queueing models to supply chain analysis and design.









Fig. 1. 3-Stage Supply Chain Network (manufacturing plant, warehouse, retailers)



3 Analysis 

At the warehouse there is some cost associated with keeping inventory and there is some backorder cost associated with orders in backlog. We assume that both inventory carrying cost and backorder cost are linear in nature. We would like to minimize the expected total cost at the warehouse. Mathematically we write

$$\text{Minimize Total Cost}_{\{K,Q\}} = h\,E[I] + b\,E[B] \qquad (1)$$

We also want to guarantee the retailers that their expected waiting time will not exceed c time units

$$E[T] \le c \qquad (2)$$

$$K, Q \in \mathbb{Z}^{+} \qquad (3)$$

We thus have formulated an optimization problem where (1) is the objective function, subject to constraints (2) and (3). In the manufacturing plant, processing of orders starts at discrete time intervals; hence we can analyze the system as a discrete time queue. The pgf of the number of units at the beginning of a random slot [7] can be given as

$$U(z) = \left[1 - A'(1)\,S'(1)\right] \frac{(z-1)\,S(A(z))}{z - S(A(z))} \qquad (4)$$

For stability, $A'(1)\,S'(1) < 1$.
Let V(z) be the pgf of system occupancy at random time points; then V(z) can be given as

$$V(z) = \int_{0}^{1} d\theta\; L\bigl(1 - X(\theta) + X(\theta)\,K(z)\bigr)\cdot U(z) \qquad (5)$$

$l(n)$ = Prob [n arrival instants in one slot], n ≥ 0

$$L(z) = \sum_{n=0}^{\infty} l(n)\, z^{n}$$








K(z) =

A(z) = L(K(z))

The arrival process when the time between two arrivals is Erlang [1] is 









(6) 


Q-i 

S=0 


\(. s\{Qtr^^^ . f. 

[V Qj{nQ-s)l n Q J 


(Qf)nQ+«+l) 1 
' (nQ + s + 1)!_ 


(7) 




$$L(z) = M_0(t) + \sum_{n=1}^{\infty} M_n(t)\, z^{n} \qquad (8)$$




Mx{s)Mo{t-s) 

m) 




(9) 




SU) - 




(10) 



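For computing inventory and back-orders, we have to develop stochastic equations which capture the properties of the system [2].

$$I(t) = \{KQ - N(t)\}^{+} \qquad (11)$$

$$B(t) = \{N(t) - KQ\}^{+} \qquad (12)$$

$$E[I] = \sum_{n=0}^{KQ} (KQ - n)\, P\{N = n\} = \sum_{n=0}^{KQ} (KQ - n)\, \frac{1}{n!} \left.\frac{d^{n}V(z)}{dz^{n}}\right|_{z=0} \qquad (13)$$

$$E[B] = \sum_{n=KQ}^{\infty} (n - KQ)\, P\{N = n\} = \sum_{n=KQ}^{\infty} (n - KQ)\, \frac{1}{n!} \left.\frac{d^{n}V(z)}{dz^{n}}\right|_{z=0} \qquad (14)$$

For computing the expected waiting time (in discrete time slots), we can use Little’s law, E[T] = E[B]/λ.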

4 Implementation and Result 

We have implemented our model using the Mathematica package. Equations (13) and (14) cannot be solved analytically in terms of variables like K and Q. So, we evaluate the expressions by substituting numerical values of K and Q and observe the behavior of the output parameters. It is not possible to compute the exact pgf of the Erlang arrival distribution in closed form. So we truncate the summation to 10 terms instead of infinity. We compared the results for summation limits of up to 100. The results differed by merely 0.003%, so we found our approximation reasonable. For the objective function we plot a graph (Fig. 2) of total inventory at the warehouse (KQ) versus total cost. We keep Q constant and vary the ratio of back-order and inventory cost (b/h). Similarly, for the constraint we plot (Fig. 3) total inventory versus expected waiting time, which is in slots. After plotting these curves we try to fit functions so that we can capture the nature of the objective and constraint functions in the form of equations.

Fig. 2. Total Inventory (KQ) vs. Total Cost when Q=3, λ=0.90, p=0.95

Fig. 3. Total Inventory (KQ) vs. Expected waiting time, λ=0.90, p=0.95

In our model, for the objective function, we input the values of Q, K and b/h, which gives the output value of total cost. Similarly, for the constraint function, we input the value of Q, which gives the expected waiting time (in slots). We solve the formulation as an integer programming problem using the branch and bound method [3]. The results are shown in Table 1. The findings are as follows:

1. As we increase Q, the inventory required at the warehouse to achieve the same service level is higher, which increases the total inventory carrying cost for the same b/h ratio.

2. As the b/h ratio decreases, total inventory at the warehouse is non-increasing.

3. At higher values of Q, b/h has no influence on K* (optimal K).
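The search for the optimal number of buckets can be illustrated by plain enumeration over K for a fixed Q. This is a simplified stand-in for the curve fitting and branch and bound procedure described above, not a reproduction of it; the distribution `pmf`, the function names and all parameter values are hypothetical.

```python
def evaluate(pmf, kq, h, b, lam):
    """Return (total cost, expected waiting time) for a base stock of kq units."""
    e_inv = sum((kq - n) * p for n, p in enumerate(pmf) if n <= kq)
    e_back = sum((n - kq) * p for n, p in enumerate(pmf) if n > kq)
    return h * e_inv + b * e_back, e_back / lam


def best_k(pmf, q, h, b, lam, c, k_max):
    """Enumerate K = 1..k_max and keep the cheapest K with E[T] <= c."""
    best = None
    for k in range(1, k_max + 1):
        cost, wait = evaluate(pmf, k * q, h, b, lam)
        if wait <= c and (best is None or cost < best[1]):
            best = (k, cost, wait)
    return best  # None if no K satisfies the waiting time constraint


# Hypothetical distribution of the number of orders N on {0, 1, 2, 3}.
pmf = [0.5, 0.25, 0.125, 0.125]
print(best_k(pmf, q=1, h=1.0, b=5.0, lam=0.5, c=1.0, k_max=4))  # (2, 1.875, 0.25)
print(best_k(pmf, q=1, h=1.0, b=5.0, lam=0.5, c=0.1, k_max=4))  # (3, 2.125, 0.0)
```

Tightening the waiting time limit c from 1.0 to 0.1 in this toy example forces a larger K even though its cost is higher, which mirrors the role of constraint (2).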



Table 1. Results for λ=0.9, p=0.95, c < 5

Q    b/h    K*    Total Inventory    Total Cost (in $)
3    5.0    15    45                 39.0725
3    2.0    14    42                 52.6487
3    1.0    14    42                 65.9768
3    0.5    14    42                 79.3044
3    0.2    14    42                 92.6322
5    5.0    17    70                 60.072
5    2.0    17    70                 94.0169
5    1.0    17    70                 127.962
5    0.5    17    70                 161.907
5    0.2    17    70                 195.852

5 Conclusion

In this paper, we have developed a model that examines the trade-off between total cost and total inventory at the warehouse for a given customer service level. This model can be viewed as a starting point for modeling SCNs where service is provided at discrete time intervals. In the present work, we have considered a very basic model of an inventory system. We assumed that the setup cost is zero. The model can be further analyzed with a non-zero setup cost. We worked under the assumption that orders can be back ordered. The model can also be extended to the lost sales case.

References 

1. Morse, P. M. (1958) Queues, Inventories and Maintenance. John Wiley and Sons Inc.

2. Buzacott, J. A. and Shanthikumar, J. G. (1993) Stochastic Models of Manufacturing Systems. Prentice Hall, New Jersey

3. Rao, S. S. (1998) Engineering Optimization: Theory and Practice (Third Edition). New Age International Ltd. Publishers

4. Kim, I. and Tang, C. S. (1997) Lead time and response time in a pull production control system. European Journal of Operational Research. 101, 474-485

5. Bertsimas, D. and Paschalidis, I. Ch. (2001) Probabilistic service level guarantees in make-to-stock manufacturing systems. Operations Research. 49, 119-133

6. Viswanadham, N. and Raghavan, N. R. Srinivasa (2000) Performance analysis and design of supply chains: a Petri net approach. Journal of the Operational Research Society. 51, 1158-1169

7. Bruneel, H. (1993) Performance of discrete-time queueing systems. Computers & Operations Research. 3, 303-320

8. Buzacott, J. A. (1989) Queueing models of kanban and MRP controlled production systems. Engineering Costs and Production Economics. 17, 3-20

9. Hadley, G. and Whitin, T. M. (1963) Analysis of Inventory Systems. Prentice-Hall Inc., Englewood Cliffs, NJ






A New Optimal Demand Forecast Model 



Joachim Althaler, Herbert Jodlbauer 

FH-Steyr, Wehrgrabengasse 1-5, A-4400 Steyr, Austria. 
e-mail:joachim.althaler@fh-steyr.at, herbert.jodlbauer@fh-steyr.at 



Abstract 

This paper presents a model for determining an optimal demand plan when both past demand figures and the results of market research concerning price elasticity are known.

The sales demand shows a high seasonal fluctuation as well as a relevant short-term fluctuation. The production capacity is restricted.

The main idea of the model is first to approximate the past demand figures in order to predict future demand figures through extrapolation. In the second step, a non-linear production optimisation model is applied to calculate the cumulative demand plan, taking into account both the price elasticity and the resource restrictions. The final predicted demand data is given by the extrapolated curve corrected by a constant. The latter is determined by the fact that the integral of the demand curve has to equal the cumulative demand plan.

In addition to this result, a beta level of service and consequently the level of the safety stock can be calculated by means of a curvilinear regression analysis.



1 Introduction 

By combining well-known methods from non-linear regression analysis and non-linear production planning, a new model is introduced for determining a profit-maximising sales forecast under high seasonal fluctuations and limited resources. The main idea is, firstly, to analyse the structure of the past sales data in order to capture the basic shape of the forecast. In the second step, the total sales for the subsequent year are fixed by optimising the profit, taking resource restrictions into account. In the final step, the optimal sales curve is calculated by solving an integral equation describing the relationship between the total sales and the extrapolated sales curve.




Fig. 1. Past sales data (sales over time)

The past sales data $(t_i, x_i)$ with $i = -T+1,\dots,0$ is known. The values $t_i$ are the sub time periods, for instance the past weeks, and $x_i$ is the vector of total sales in the time period $t_i$. T is the number of observed time periods, for instance 52 for one year.

Typically, the sales data has seasonal fluctuations as well as short-term fluctuations, as shown in Fig. 1 for two products.

As a result of market research, the price elasticity a(p) describing the relationship between the price p and the possible total sales a(p) for each product is given (see Fig. 2). In addition, lower and upper boundaries u and o for the sales are defined. For instance, the sales force knows that the maximum possible sales are limited by o, or for strategic reasons a minimum u should be sold.

The required resources (employees, machines, ...) for producing the products are fixed in the matrix $M = (m_{ij})$ describing how many units of the resource i are necessary to produce one unit of the product j. All the resources, of course, are limited by the maximum available resources b. The vector $c = (c_i)$ describes the costs of one unit of the resource i.

The goal is to determine a forecast $x_{opt}(t)$ of the sales of the next T periods using all the data given above, especially the past sales data, the price elasticity and the resource restrictions, such that the contribution margin is maximised.





Fig. 2. Price elasticity 








2 Mathematical Model 

To begin with, a suitable parameterised function for each product Xj{t) has to be in- 
troduced in order to approximate the past sales data. The sales data is character- 
ized by seasonal fluctuations and a trend. The functions 

Xj (t) = dj + */ + 4 cos — ( ; - T J j ( 1 ) 

can meet these requirements in a more generalized case than the functions pro- 
posed in [1]. The parameters dj, kj, Aj and Tj are interpreted as follows: 

dj constant summand 

kj descent of the trend 

Aj amplitude of the seasonal fluctuations 

Tj phase translation. 

The shape of the curves is fixed by these parameters. If the character of the
seasonal fluctuations remains the same, the parameter $\tau_j$ will be
independent of time. In addition, a constant $k_j$ and a constant ratio of the
amplitude $A_j$ of the seasonal fluctuations to the total sales are assumed. By
varying the summand $d_j$ the total sales can be changed.

The unknown parameters $d_j$, $k_j$, $A_j$ and $\tau_j$ are calculated by
applying a non-linear regression analysis. The non-linearity is caused by the
parameter $\tau_j$.

minimise $\sigma_{x_j}$ (2)

with

$x_j(t_i)$       approximated sales curve (product $j$)
$x_{ji}$         real past sales at time $t_i$ (product $j$)
$\sigma_{x_j}$   deviation (product $j$).

The idea in (2) is to minimise the deviation of the approximated sales function
from the real past sales data as shown in Fig. 3 for two products.
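The regression step can be illustrated with a small sketch. It assumes that the seasonal term has period `period`, that the deviation in (2) is a root-mean-square error (the exact expression is not recoverable from the text), and that the phase is scanned on a grid; the function names `fit_sales_curve` and `solve3` are hypothetical. For a fixed phase the model is linear in the remaining three parameters, so only the phase needs a non-linear search.

```python
import math

def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            return None  # singular system for this phase value
        m[col], m[piv] = m[piv], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def fit_sales_curve(sales, period, tau_steps=200):
    """Fit x(t) = d + k*t + A*cos(2*pi/period*(t - tau)) to sales[0..n-1].

    Assumption: the deviation is measured as root-mean-square error.
    tau is scanned on a grid; d, k and A follow from linear least squares.
    """
    n = len(sales)
    best = None
    for step in range(tau_steps):
        tau = period * step / tau_steps
        rows = [(1.0, float(t), math.cos(2 * math.pi / period * (t - tau)))
                for t in range(n)]
        # normal equations (R^T R) beta = R^T y for the linear parameters
        ata = [[sum(r[a] * r[b] for r in rows) for b in range(3)] for a in range(3)]
        aty = [sum(r[a] * y for r, y in zip(rows, sales)) for a in range(3)]
        beta = solve3(ata, aty)
        if beta is None:
            continue
        sse = sum((y - sum(c * v for c, v in zip(beta, r))) ** 2
                  for r, y in zip(rows, sales))
        if best is None or sse < best[0]:
            best = (sse, beta[0], beta[1], beta[2], tau)
    sse, d, k, amp, tau = best
    if amp < 0:  # (A, tau) and (-A, tau + period/2) describe the same curve
        amp, tau = -amp, (tau + period / 2) % period
    return {"d": d, "k": k, "A": amp, "tau": tau, "sigma": math.sqrt(sse / n)}
```

A finer grid or a local refinement around the best grid point would be a natural next step when the true phase does not lie on the grid.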

Fig. 3. Approximated past sales data

It is not enough to extrapolate the approximated sales function to guarantee a
maximised profit. Instead, a non-linear adjusted production planning model is
applied, whereby the parameters $\tau_j$ and $k_j$ are fixed by solving (2) and
the two parameters $A_j$ and $d_j$ are not yet optimised. The model to be
solved is defined by









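$$\max_{p}\ (p^T - c^T M)\, a(p) \qquad \text{max. contribution margin} \qquad (3)$$

whereby the following conditions must hold:

$$u \le a(p) \le o \qquad \text{lower and upper boundaries for sales} \qquad (4)$$

$$M a(p) \le b \qquad \text{limited resources} \qquad (5)$$

$$\frac{A_j}{\int_{-T}^{0} x_j(t)\,dt} = \frac{A_j(p)}{\int_{0}^{T} y_j(t,p)\,dt} \qquad \text{ratio amplitude to total sales remains} \qquad (6)$$

$$a_j(p) = \int_{0}^{T} y_j(t,p)\,dt \qquad \text{total sales} \qquad (7)$$

with

$$y_j(t,p) = d_j(p) + k_j t + A_j(p) \cos\left(\frac{2\pi}{T}(t - \tau_j)\right) \qquad (8)$$

$A_j$      amplitude of the approximated curve
$A_j(p)$   amplitude of the unknown optimal curve
$d_j(p)$   summand of the unknown optimal curve.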



Solving problem (3), the optimal price $p$, the optimal amplitude $A_j$ and the
optimal constant summand $d_j$ are determined by maximising the contribution
margin. By using the price elasticity, the optimal price defines the optimal
total sales. The amplitude $A_j$ is calculated by ensuring a constant ratio of
amplitude to total sales. In order to determine the constant summand $d_j$, the
relationship between the total sales $a_j(p)$ and the sales function has to be
taken into account.

Because problem (3) can be separated, the solution can be carried out in two
steps. The first step is to solve the optimisation problem (3), s.t. (4) and
(5).

This results in the optimal price $p_{opt}$ and the optimal total sales
$a(p_{opt})$.
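As an illustration of the first step, the following sketch treats a single product with a single resource and searches a price grid. The linear elasticity function, the grid and the name `best_price` are assumptions made for the example only; the paper does not state how (3) to (5) are solved.

```python
def best_price(prices, elasticity, unit_resource_cost, resource_per_unit,
               capacity, lower, upper):
    """Grid search for one product and one resource (illustrative only).

    Maximises (p - c*m) * a(p) subject to lower <= a(p) <= upper and
    m * a(p) <= capacity, mirroring (3), (4) and (5).
    """
    best = None
    for p in prices:
        a = elasticity(p)
        if not (lower <= a <= upper):          # sales boundaries (4)
            continue
        if resource_per_unit * a > capacity:   # limited resources (5)
            continue
        margin = (p - unit_resource_cost * resource_per_unit) * a  # objective (3)
        if best is None or margin > best[0]:
            best = (margin, p, a)
    return best

# Hypothetical data: a(p) = 1000 - 50 p, resource cost 2 per unit,
# 1.5 resource units per product unit, 600 resource units available.
result = best_price(
    prices=[i * 0.5 for i in range(2, 41)],
    elasticity=lambda p: 1000 - 50 * p,
    unit_resource_cost=2.0,
    resource_per_unit=1.5,
    capacity=600.0,
    lower=100.0,
    upper=900.0,
)
```

In this example the resource limit is binding: the unconstrained optimum would lie at a lower price, but the capacity caps the sales at 400 units.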

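The second and last step is to determine the optimal parameters
$A_j(p_{opt})$, $d_j(p_{opt})$ and the optimal sales forecast $x_{opt}(t)$ by
calculating subsequently

$$A_j(p_{opt}) = a_j(p_{opt})\, \frac{A_j}{\int_{-T}^{0} x_j(t)\,dt} \qquad (9)$$

$$d_j(p_{opt}) = \frac{1}{T} \left[ a_j(p_{opt}) - \int_{0}^{T} \left( k_j t + A_j(p_{opt}) \cos\left(\frac{2\pi}{T}(t - \tau_j)\right) \right) dt \right] \qquad (10)$$

$$x_{opt}(t) = y(t, p_{opt}) \qquad (11)$$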

Equations (9), (10) and (11) are a direct result of “the constant ratio amplitude to 
total sales equation” (6), “the total sales equation” (7), the definition of yj(t,p) in 
(8) and additionally of the definition of Xj(t) in (1). 
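The second step can be sketched numerically. The sketch assumes that the seasonal cosine has period $T$, so that its integral over $[-T, 0]$ and over $[0, T]$ vanishes and the integrals in (6) and (7) have closed forms; the function names and the sample figures are hypothetical.

```python
import math

def optimal_curve_parameters(total_sales, d, k, amp, T):
    """Rescale amplitude and summand for a given optimal total sales value.

    Assumption: the cosine term has period T, so it integrates to zero over
    a full horizon and only the constant and trend terms contribute.
    """
    past_total = d * T - k * T * T / 2.0        # integral of x_j over [-T, 0]
    amp_opt = total_sales * amp / past_total    # constant ratio amplitude/sales
    d_opt = (total_sales - k * T * T / 2.0) / T # total sales over [0, T] matched
    return amp_opt, d_opt

def forecast(t, d_opt, k, amp_opt, tau, T):
    """Optimal sales curve y(t, p_opt) with the rescaled parameters."""
    return d_opt + k * t + amp_opt * math.cos(2 * math.pi / T * (t - tau))

# Hypothetical fitted curve (d=100, k=0.5, A=20, tau=13, T=52) and an
# optimal total sales value of 6000 units for the next T periods.
amp_opt, d_opt = optimal_curve_parameters(6000.0, d=100.0, k=0.5, amp=20.0, T=52)
```

Integrating `forecast` over $[0, T]$ reproduces the prescribed total sales, which is a quick consistency check on the rescaled parameters.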

As Fig. 4 shows, the optimal forecast for the product P2 is drastically below
the extrapolated function. More contribution margin and profit is earned by
reducing the total sales and increasing the price. For the other product the
result is the opposite: because of its increased total sales, its amplitude
changed significantly.




3 Additional Results 



The short-term fluctuations cause a fluctuation in the production load as well
as in the sourcing demand. One idea for handling the short-term fluctuation in
the sourcing is to define a safety stock level (see for instance [2, 3]) for
the sourcing warehouse. The safety stock level should guarantee to a high
percentage $\beta$ that both the forecasted sourcing demand within the sourcing
time period and the additional sourcing demand caused by short-term
fluctuations can be dealt with. For simplicity (12) is conceived only for one
specific sourcing product:



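$$P\left( \sum_{r=1}^{T_S} s_{i+r} \le \sum_{r=1}^{T_S} s(t_{i+r}) + s_{\beta\text{-safety}} \right) \ge \beta \qquad \text{availability condition} \qquad (12)$$

whereby

$s_{\beta\text{-safety}}$        safety stock with safety level $\beta$
$s_i = S x_i \in \mathbb{R}$     sourcing demand caused by the vector of real sales
$S$                              sourcing matrix
$T_S$                            sourcing time for a certain product
                                 (number of sub periods, for instance days)
$P(E)$                           probability of some event $E$.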








Formula (13) is an explicit expression for the safety stock and is the result
of applying the normal distribution transformation on the availability
condition in (12) and taking into account that the real demand $s_i$ is a
random number with deviation $\sigma_s$ and mean $s(t_i)$.

^/?-safety “ ^ j ) ’^N(0,\) (13) 



whereby

$s(t_i) = S x(t_i) \in \mathbb{R}$   sourcing demand caused by the forecasted
                                     sales vector $x(t_i)$, $S$ sourcing matrix
$\sigma_s$                           deviation of the sourcing demand
$\sigma_x = (\sigma_{x_1}, \ldots, \sigma_{x_n})^T$
                                     deviation vector of the approximated sales
                                     functions (for $n$ products) to the past
                                     sales data
$z_{N(0,1)}(\beta)$                  quantile of the standard normal
                                     distribution $N(0, 1)$.
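The role of the normal quantile can be illustrated with a short sketch. It uses the textbook form of a safety stock under independent, normally distributed deviations per sub period; this form is an assumption and may differ from the paper's expression (13). The function name `safety_stock` and the sample values are hypothetical.

```python
import math
from statistics import NormalDist

def safety_stock(sigma_per_subperiod, sourcing_time, beta):
    """Textbook safety stock for a given safety level beta.

    Assumption: deviations in the sourcing_time sub periods are independent
    and normally distributed with standard deviation sigma_per_subperiod,
    so the deviation of their sum grows with the square root of the horizon.
    """
    z = NormalDist().inv_cdf(beta)  # beta-quantile of N(0, 1)
    return z * sigma_per_subperiod * math.sqrt(sourcing_time)

# Hypothetical values: deviation 10 units per day, sourcing time 4 days.
stock_975 = safety_stock(10.0, 4, 0.975)
```

A safety level of 0.5 yields a safety stock of zero, since the forecast alone already covers the demand in half of the cases under this assumption.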



As shown in Fig. 5, the real sourcing demand within the sourcing time period is 
given by the area below the real sourcing data and between the two arrows. The 
safety stock level is defined by the area, which can be seen in Fig. 5 between the 
upper safety boundary curve and the forecasted sourcing demand curve. 




Fig. 5. Illustration of the safety stock level 



References 

1. Tempelmeier H (1999) Material-Logistik: Modelle und Algorithmen für die
Produktionsplanung und -steuerung und das Supply-chain-Management. 4.,
überarb. u. erw. Aufl. Springer, Berlin Heidelberg New York, pp 88-90

2. Grubbström RW, Tang O (1999) Further Developments on Safety Stocks in an
MRP System applying Laplace Transforms and Input-Output Analysis. In:
International Journal of Production Economics, Vol. 60-61, pp 381-387

3. Minner S (Ed.) (2000) Strategic Safety Stocks in Supply Chains. In: Lecture
Notes in Economics and Mathematical Systems, Vol. 490, Springer, Berlin
Heidelberg New York, pp 33-53






A Multi-Product Batch-Available-to-Promise 
Model for Make-to-Stock Manufacturing 



Richard Pibernik

Seminar für Logistik und Verkehr, Johann Wolfgang Goethe-Universität,
Mertonstr. 17, 60054 Frankfurt am Main, pibernik@wiwi.uni-frankfurt.de



1 Introduction 

Available-to-Promise (ATP) comprises a variety of tools that enhance the
responsiveness of order promising and the reliability of order fulfilment.
Based on customer requests (i.e. requested product, order quantity and delivery
time window) they support "order quantity" and "order due date quoting".

ATP is usually integrated in Enterprise Resource Planning (ERP) systems and
Advanced Planning Systems (APS). Its functional scope can vary significantly.
"Conventional" ATP, commonly implemented in ERP systems, merely determines
the availability of finished goods at a certain point of time in the future.
"Advanced" ATP provides a broader scope of functions. In this paper, Advanced
ATP is considered as a decision-making system which simultaneously allocates
available finished goods inventory to customer orders and quotes order due
dates in a make-to-stock manufacturing environment.

The major goals pursued with the implementation of ATP are the improvement of 
on time delivery by generating reliable quotes, the reduction of the number of 
missed business opportunities by employing more effective methods for order 
promising and an increase of revenue and profitability by increasing the average 
sales price (see [4]). 

The development of methods and their application to support order promising has 
primarily been driven by providers of ERP and APS (e.g. SAP, Manugistics, i2 
Technologies). Up to now, a very limited number of theoretically founded contri- 
butions have been made. A variety of papers discuss the needs or propose features 
for ATP Systems (e.g. [3], [4] and [6]). Very few contributions provide quantita- 
tive models or algorithms for quantity and due date promising. Major contribu- 
tions have been made by Chen/Zhao/Ball ([1],[2]). The authors develop a mixed- 
integer programming ATP model that allocates resources among customer orders 
that arrive within a pre-determined time interval (batching interval). Their contri- 
bution, however, is focused on a specific configure-to-order case. In this paper a 
batch ATP model for due date quoting on the basis of finished goods inventory is 
presented. This model can be customized in order to support company specific 
needs and is therefore applicable for any firm employing a make-to-stock manu- 
facturing strategy. 

The remainder of this paper is organized as follows: In section 2 we first
describe the ATP scheduling problem and introduce the relevant notations.
Thereafter we present a mixed integer programming formulation applicable for
generating ATP schedules and characterize major applications and extensions.
The paper concludes with a summary of the findings.



2 The Batch Available to Promise Model 

2.1 Problem Description and Notations 

Consider a manufacturer of $N$ different products $n = 1,\ldots,N$. The
manufacturer employs a make-to-stock production strategy. Let $[t_a, t_e]$ be
the ATP planning horizon, consisting of $T$ discrete time periods
$[t_a, t_a + 1], \ldots, [t_e - 1, t_e]$. The point of time when the model is
being executed is denoted by $t_a$. Order promising decisions will be made for
a batch of potential customer orders, collected within the time interval
$[t_a - \tau, t_a]$ (batch-interval), where $\tau$ represents the length of the
batch interval. $A(t_a)$ denotes the set of potential customer orders collected
within the batch-interval. Every potential order $i \in A(t_a)$ can be
characterized by a triple $(d_i, z_i^u, z_i^o)$, with
$d_i = (d_i^1, \ldots, d_i^N)$ the vector of order quantities for the $N$
products, $z_i^u \in \{t_a + 1, \ldots, t_e\}$ the earliest date of delivery
and $z_i^o \in \{t_a + 1, \ldots, t_e\}$, $z_i^o \ge z_i^u$, the latest date of
delivery.¹ The interval $[z_i^u, z_i^o]$ represents the customer's delivery
time window, with $z_i^u = z_i^o$ if the customer insists on a fixed delivery
date.

We assume the manufacturing schedule is fixed for the ATP planning horizon
$[t_a, t_e]$. At every point of time $t$ ($t = t_a + 1, \ldots, t_e$) a given
quantity of $q_t^n$ units of product $n$ is produced and put into stock. For
simplicity and without loss of generality, we further assume that the $q_t^n$
units can be delivered to a customer at the point of time $t$ and that the
delivery time is zero. Let $b_t^n$ denote the inventory on hand at point of
time $t$ ($t = t_a + 1, \ldots, t_e$). We assume the inventory on hand
$b_{t_a+1}^n$ at $t_a + 1$ is determined by the ATP run executed at point of
time $t_a - \tau$. The inventory on hand $b_t^n$ and the production quantity
$q_t^n$ determine the quantity of product $n$ "available to promise" to a
customer at point of time $t$.
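How inventory on hand and production quantities determine what can be promised may be sketched for a single product as follows. The function name `inventory_on_hand` and the sample numbers are illustrative assumptions; the recursion is simply previous inventory plus production minus deliveries.

```python
def inventory_on_hand(initial, production, deliveries):
    """Inventory after each period for one product (illustrative sketch).

    production[k] and deliveries[k] refer to the k-th period of the planning
    horizon. A negative level means more was promised than was available.
    """
    levels = []
    level = initial
    for produced, delivered in zip(production, deliveries):
        level = level + produced - delivered
        if level < 0:
            raise ValueError("deliveries exceed the quantity available to promise")
        levels.append(level)
    return levels

# Hypothetical data: 5 units in stock, 10 units produced per period.
levels = inventory_on_hand(5, [10, 10, 10], [8, 12, 0])
```

In the second period 12 units can be delivered although only 10 are produced, because 7 units are carried over from the first period.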

We assume an order can, in certain cases, also be fulfilled by means of
substitute products.² Let $M(n)$ denote the set of ordered tuples of
substitution possibilities for product $n$:

$$M(n) := \{ (n, \tilde{n}) \mid \text{product } \tilde{n} \text{ can be employed as a substitute for } n;\ \tilde{n} \in \{1, \ldots, N\};\ \tilde{n} \ne n \}$$

Let $M$ be the set union of all sets $M(n)$ ($n = 1, \ldots, N$).

¹ For simplicity, we assume that the customer specifies the earliest and latest
due date for all ordered products. Considering different due dates, however, is
not difficult: The orders can be divided into sub orders with different due
dates which are then subject to order promising.

² A manufacturer of hard disks, for example, may deliver drives with a higher
capacity if he runs out of drives with lower capacity.

Apparently, the firm has to solve an assignment problem. The potential customer
orders collected within the batch-interval have to be assigned to the
quantities of the $N$ products available to promise during the time interval
$[t_a + 1, t_e]$. We will call a (feasible) solution of this assignment problem
an "ATP schedule". This schedule specifies which orders are fulfilled by which
quantities of the $N$ products at which point of time within the ATP planning
horizon.

In order to determine an ATP schedule with the batch model presented in the
following section, we define the following variables:

$x_i^n(t)$: Delivered quantity of product $n$ at point of time $t$ in order to
fulfil customer order $i$.

$x_i^{n,\tilde{n}}(t)$: Delivered quantity of product $\tilde{n}$ as substitute
for product $n$ at point of time $t$ in order to fulfil customer order $i$.

$r_i^{n,\tilde{n}} = 1$ if product $n$ is substituted by product $\tilde{n}$ in
order to fulfil order $i$, and $0$ else.

$u_i(t) = 1$ if the due date of order $i$ is $t$, and $0$ else.

$v_i = 1$ if order $i$ is fulfilled within $[z_i^u, z_i^o]$, and $0$ else.

With the decision variables $x_i^n(t)$ and $x_i^{n,\tilde{n}}(t)$ an ATP
schedule can be represented by an $(n(A(t_a))) \times T$ matrix $x$.⁵ The
components of $x$ are the ordered tuples $(x_i(t), \tilde{x}_i(t))$
($i \in A(t_a)$, $t = t_a + 1, \ldots, t_e$) with

$$x_i(t) := (x_i^1(t), \ldots, x_i^N(t)) \quad \text{and} \quad \tilde{x}_i(t) := \begin{pmatrix} x_i^{1,1}(t) = 0 & \cdots & x_i^{1,N}(t) \\ \vdots & \ddots & \vdots \\ x_i^{N,1}(t) & \cdots & x_i^{N,N}(t) = 0 \end{pmatrix}$$

It is the firm's objective to determine an ATP schedule which maximizes overall
net profit, denoted by $P$, within the time interval $[t_a + 1, t_e]$. In order
to calculate the overall profit associated with an ATP schedule, we define:

$p_i^n$: profit contribution of one unit of product $n$ with regard to order
$i$.⁶



³ $M(n)$ can be defined individually for every customer order $i \in A(t_a)$ or
for different types of customer orders. For simplicity we assume $M(n)$ is
applied to every potential customer order.

⁴ The first delivery can take place at point of time $t_a + 1$, i.e. at the
point of time after executing the model.

⁵ $n(A(t_a))$ denotes the number of elements in the set $A(t_a)$.

⁶ We assume that the profit contribution does not include holding costs.












$p_i^{n,\tilde{n}}$: profit contribution of one unit of product $\tilde{n}$
when used as substitute for $n$ with regard to order $i$
($p_i^{n,\tilde{n}} \le p_i^n$).⁷

$hc^n$: holding costs for one unit of product $n$.

$dp_i$: penalty associated with the denial of customer order $i$.⁸



2.2 Model Formulation 



To determine an ATP schedule, which maximizes the overall profit P, we can em- 
ploy the following mixed-integer-programming model: 



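$$\max\ P = \sum_{t=t_a+1}^{t_e} \sum_{i \in A(t_a)} \sum_{n=1}^{N} p_i^n\, x_i^n(t) + \sum_{t=t_a+1}^{t_e} \sum_{i \in A(t_a)} \sum_{n=1}^{N} \sum_{(n,\tilde{n}) \in M(n)} p_i^{n,\tilde{n}}\, x_i^{n,\tilde{n}}(t) - \sum_{t=t_a+1}^{t_e} \sum_{n=1}^{N} hc^n\, b_t^n - \sum_{i \in A(t_a)} dp_i\, (1 - v_i) \qquad (3)$$

s.t.

$$x_i^n(t) + \sum_{(n,\tilde{n}) \in M(n)} x_i^{n,\tilde{n}}(t) = d_i^n\, u_i(t) \qquad \text{for all } i \in A(t_a);\ n = 1, \ldots, N;\ t = t_a + 1, \ldots, t_e \qquad (4)$$

$$b_t^n = b_{t-1}^n + q_t^n - \sum_{i \in A(t_a)} x_i^n(t) - \sum_{i \in A(t_a)} \sum_{(\tilde{n},n) \in M} x_i^{\tilde{n},n}(t) \qquad n = 1, \ldots, N;\ t = t_a + 1, \ldots, t_e \qquad (5)$$

$$\sum_{t=z_i^u}^{z_i^o} u_i(t) = v_i \qquad \text{for all } i \in A(t_a) \qquad (6)$$

$$\sum_{(n,\tilde{n}) \in M(n)} r_i^{n,\tilde{n}} \le \bar{r} \qquad \text{for all } i \in A(t_a);\ n = 1, \ldots, N \qquad (7)$$

$$x_i^{n,\tilde{n}}(t) \le r_i^{n,\tilde{n}} \cdot d_i^n \qquad \text{for all } i \in A(t_a),\ (n,\tilde{n}) \in M(n);\ n = 1, \ldots, N \qquad (8)$$

$$x_i^n(t) \ge 0 \qquad \text{for all } i \in A(t_a);\ n = 1, \ldots, N \qquad (9)$$

$$x_i^{n,\tilde{n}}(t) \ge 0 \qquad \text{for all } i \in A(t_a),\ (n,\tilde{n}) \in M(n);\ n = 1, \ldots, N \qquad (10)$$

$$b_t^n \ge 0 \qquad n = 1, \ldots, N;\ t = t_a + 1, \ldots, t_e \qquad (11)$$

$$u_i(t) \in \{0, 1\} \qquad \text{for all } i \in A(t_a);\ z_i^u \le t \le z_i^o \qquad (12)$$

$$v_i \in \{0, 1\} \qquad \text{for all } i \in A(t_a) \qquad (13)$$

$$r_i^{n,\tilde{n}} \in \{0, 1\} \qquad \text{for all } i \in A(t_a),\ (n,\tilde{n}) \in M(n);\ n = 1, \ldots, N \qquad (14)$$

⁷ If a substitute product is used to fulfil an order, it has to be considered
that the customer will usually not pay the regular price for the substitute,
but rather the lower price for the product originally ordered.

⁸ $dp_i$ has to account for contract penalties, loss of profits associated with
order $i$ and loss of future profits, if the customer switches to a different
supplier.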



The objective function (3) accounts for both tangible and intangible terms. The 
tangible terms include profit from promised orders (taking into account the deliv- 
ery of originally ordered products and substitutes) as well as inventory costs for 
finished goods. The intangible term represents penalties associated with order de- 
nial. A feasible solution (ATP schedule) is subject to the constraints (4)-(14). Con- 
straint (4) ensures that the delivered quantity is equal to the ordered quantity for 
every accepted order. Balance of finished goods inventory is provided by (5). Note 
that the last term in (5) accounts for the quantity of product n, used for substitu- 
tion. Constraint (5) and constraint (11), which models non-negativity of inventory 
on hand, ensure that only quantities “available to promise” are actually assigned to 
customer orders. Constraint (6) ensures delivery within the delivery time
window $[z_i^u, z_i^o]$ and determines the value of the binary variable $v_i$.
Constraint (7) limits the number of products employed for the substitution of a
product $n$ to $\bar{r}$. We assume the customers will not accept the delivery
of a number of substitute products greater than $\bar{r}$. Constraint (8)
determines the value of the binary variable $r_i^{n,\tilde{n}}$. Constraints
(9), (10) and (11) assure nonnegativity of delivered quantities and inventory
on hand. Constraint (12) defines the binary variable $u_i(t)$ within the
delivery time window $[z_i^u, z_i^o]$.
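To make the interplay of due date choice, inventory balance, holding costs and denial penalties concrete, the following sketch enumerates all schedules of a tiny instance. It is a deliberately simplified stand-in, not the mixed-integer model itself: it assumes a single product, no substitution and complete enumeration instead of a MIP solver. The function name, the dictionary keys and the data are hypothetical.

```python
from itertools import product

def best_atp_schedule(orders, production, initial_inventory, holding_cost):
    """Brute-force batch ATP for one product without substitution.

    Each order is either denied (None) or assigned one due date inside its
    delivery time window. Periods are numbered 1..len(production).
    """
    horizon = len(production)
    choices = [[None] + list(range(o["earliest"], o["latest"] + 1)) for o in orders]
    best_profit, best_dates = None, None
    for dates in product(*choices):
        inventory, profit, feasible = initial_inventory, 0.0, True
        for t in range(1, horizon + 1):
            shipped = sum(o["quantity"] for o, due in zip(orders, dates) if due == t)
            inventory += production[t - 1] - shipped   # inventory balance
            if inventory < 0:                          # non-negative inventory
                feasible = False
                break
            profit -= holding_cost * inventory         # holding costs
        if not feasible:
            continue
        for o, due in zip(orders, dates):
            if due is None:
                profit -= o["penalty"]                 # denial penalty
            else:
                profit += o["unit_profit"] * o["quantity"]
        if best_profit is None or profit > best_profit:
            best_profit, best_dates = profit, dates
    return best_profit, best_dates

orders = [
    {"quantity": 10, "earliest": 1, "latest": 1, "unit_profit": 5, "penalty": 20},
    {"quantity": 15, "earliest": 2, "latest": 3, "unit_profit": 4, "penalty": 10},
    {"quantity": 10, "earliest": 1, "latest": 2, "unit_profit": 6, "penalty": 30},
]
profit, dates = best_atp_schedule(orders, production=[10, 10, 10],
                                  initial_inventory=0, holding_cost=1.0)
```

With 30 units produced and 35 units requested, one order has to be denied. The enumeration denies the second order, whose penalty is lowest, and quotes periods 1 and 2 for the other two. Enumeration grows exponentially with the number of orders, which is why a MIP formulation is used for realistic batch sizes.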



2.3 Applications and Extensions 

The model presented in the previous section is suitable for implementation in a 
software application, which supports order promising on a daily basis. As a com- 
ponent of a supply chain management system it can then obtain the relevant data, 
needed to solve the presented scheduling problem, from ERP and APS. Apart 
from application in day-to-day order promising, the model can also be employed 
in order to support tactical decisions, e.g. concerning the length of the batch inter- 
val (see [2]) and the consideration of customer specific order priorities. 

The model can easily be enhanced to account for a variety of additional func- 
tionalities. It can, for example, be modified in order to allow for partial deliveries. 
If the order quantity is not available within the given delivery time window, the 
customer order can be fulfilled with two or more partial deliveries, whereas the 
first partial delivery is carried out within the given time window. The model then 
determines the quantities and delivery dates for each partial delivery (see [5]). 

The model can also be applied to a distribution network, rather than only to a 
single location. It then has to be modified to account for different manufacturing 
and transportation lead times and costs, depending on the considered locations. 








3 Summary 

In this paper we presented a basic batch ATP model, which can be employed to
support day-to-day order promising of a firm applying a make-to-stock
manufacturing strategy. Based on potential orders with specific customer
requests (requested product, order quantity and delivery time window) and the
product quantities "available to promise", the model generates an ATP schedule
for a pre-defined planning horizon. This schedule specifies order due dates as
well as product quantities and product types used to fulfil the set of
potential customer orders. The model can be customized in order to account for
firm specific needs. It is suitable for implementation in a software
application to support order promising, which can be integrated in ERP and APS.



References 

[1] Chen C-Y, Zhao Z-Y, Ball MO (2000) A Model for Batch Advanced Available-To-
Promise. To appear in: Production and Operations Management, available Internet:
http://bmgt1-notes.umd.edu/faculty/km/papers.nsf.

[2] Chen C-Y, Zhao Z-Y, Ball MO (2001) Quantity and Due Date Quoting Available To
Promise. Information Systems Frontiers 3:4: 477-488.

[3] Fischer ME (2001) "Available to Promise": Aufgaben und Verfahren im Rahmen des
Supply Chain Management. Regensburg.

[4] Kilger C, Schneeweiss L (2000) Demand Fulfilment and ATP. In: Stadtler H, Kilger C
(Eds) Supply Chain Management and Advanced Planning. Berlin et al, pp 79-95.

[5] Pibernik R (2002) Advanced Available to Promise: Models and Algorithms for Order
Promising and Fulfilment. Working Paper Lehrstuhl Logistik, Universität Frankfurt.

[6] Robinson A, Dilts DM (1999) OR and ERP: Can operations research play a role in
fast-growing, enterprise-wide information systems? OR/MS Today 26/3: 30-37.






Scheduling of Rolling Ingots Production 



Christoph Schwindt and Norbert Trautmann

Universität Karlsruhe, Institut für Wirtschaftstheorie und Operations Research,
D-76128 Karlsruhe, Germany,

e-mail: {schwindt,trautmann}@wior.uni-karlsruhe.de



Abstract. We consider a real-world scheduling problem arising in the context of 
a rolling ingots production. We review the production process and discuss pecu- 
liarities that have to be observed when scheduling a given set of production orders 
on the production facilities. We then describe a model for this scheduling prob- 
lem using prescribed time lags between operations, different kinds of resources, and 
sequence-dependent changeovers. The basic principle of the solution procedure is to 
relax the resource constraints by assuming infinite resource availability. Resulting 
resource conflicts are then stepwise resolved by introducing precedence relation- 
ships among operations competing for the same resources. The algorithm has been 
implemented as a beam search heuristic enumerating alternative sets of precedence 
relationships. 



1 Introduction 



This paper deals with a scheduling problem arising in the aluminium industry.
We consider the production of rolling ingots, i.e. ingots of a certain aluminium
alloy in rectangular form. These ingots are the starting material for the rolling
of sheet, strip, and foil. The production flow is as follows (cf. Kammer 1999,
Section 1.4, and Figure 1).






In a melting furnace called potroom, the ingredients composing the alloy are
smelted in an electrolytical process. In general, several alternative potrooms
are available. One or several casting units belong to each potroom. A casting
unit consists of a holding fixture for a mould, a retractable hydraulic
cylinder named stool, and a so-called stool-cap, which closes the bottom of the
mould at the start of the casting process. The melt is cast through the mould,
which determines the cross-section of the ingot.

Fig. 1. Production flow chart

As soon as the metal in the mould begins to solidify, the stool is lowered
and the resulting ingot is cooled by spraying water. The maximum stroke
of the stool determines the maximum cast length, which implies that not
every ingot can be produced on each casting unit. However, all casting units 
belonging to one potroom are of the same height. After the completion of the 
casting process, the ingot stays some time in the casting unit for cooling. All 
ingots produced within one casting are of the same alloy and same length. 
The casting has to be started and completed at the same time at all casting 
units of a potroom. It is not necessary to use all casting units during a 
casting process nor to use their full length. When passing to the casting of 
an ingot with a different cross-section, the mould of the casting unit has to 
be changed. The changeover can only be performed when no casting is in 
process. Moreover, only one mould per potroom can be exchanged at a time. 

The production scheduling problem is as follows. Given a set of production 
orders for ingots characterized by their size and alloy, the problem consists of 
computing a feasible production schedule with minimum makespan. To the 
best of our knowledge, there is no scheduling procedure known from literature 
that can be applied to this scheduling problem. Some approaches have been 
proposed for related problems (cf. Harjunkoski and Grossmann 2001, Kempf 
et al. 1998, and Fleischmann and Jess 1985). 

2 Temporal and resource constraints 

In this section we discuss the set of operations to be scheduled, the prescribed 
temporal relationships between individual operations, and the different types 
of resource constraints. For a formal statement of the production scheduling 
problem we refer to Schwindt and Trautmann (2002). 

Each production order for an individual ingot corresponds to a job con- 
sisting of the three operations melting, changeover of the mould, and casting 
and cooling. The duration of a melting operation equals the melting plus the 
casting time because the potroom is occupied up to the end of the casting. 
The duration of a casting and cooling operation corresponds to the time 
needed for casting and cooling the ingot. The duration of a changeover oper- 
ation depends on the sequence in which the jobs are processed on the casting 
units. If two consecutive jobs require different mould types, the duration of 
the changeover operation equals the setup time for installing the mould be- 
longing to the second job. Otherwise, the mould need not be replaced, and 
the duration is equal to zero. 
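The sequence dependence of the changeover durations can be sketched in a few lines. The function name `changeover_durations` and the optional `installed` argument are illustrative assumptions; the setup time of 10 minutes in the example is the value assumed for the test instances in Section 4.

```python
def changeover_durations(mould_sequence, setup_time, installed=None):
    """Changeover duration before each job on one casting unit.

    mould_sequence lists the mould type required by each job in processing
    order; installed is the mould type on the unit at the start (None if
    the unit is empty). A changeover takes setup_time only when the mould
    type changes, otherwise its duration is zero.
    """
    durations = []
    current = installed
    for mould in mould_sequence:
        durations.append(0 if mould == current else setup_time)
        current = mould
    return durations

# Five jobs on one casting unit, mould installation takes 10 minutes.
durations = changeover_durations([1, 1, 2, 2, 1], setup_time=10)
```

Grouping jobs with the same mould type therefore saves setup time, which is what makes the job sequence on a casting unit part of the optimisation.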

Since the melting operations may be performed in alternative potrooms, 
we define a set of alternative execution modes for each job, where each job 
has to be carried out in exactly one mode. 

The following temporal constraints arise from technological require- 
ments: 

1. The mould has to be installed before the casting, i.e., there is a minimum 
time lag of 0 between the completion of the changeover and the start of 
the casting operation. 








2. Since the potroom remains occupied up to the end of the casting, melting 
and casting must be completed simultaneously. Thus, there is a minimum 
and a maximum time lag equal to the cooling time between the comple- 
tion of the melting and the completion of the casting operation. 

Basically, all production facilities (potrooms, moulds, and casting units) 
correspond to renewable resources (cf. Brucker et al. 1999). An opera- 
tion takes up a mode-dependent number of units of each resource. For each 
renewable resource, no more units than available can be used simultaneously. 

In a potroom, only one alloy can be melted at a time. This condition 
can be taken into account by using the concept of batching machines 
and incompatible job families discussed in literature on machine scheduling 
(cf. Potts and Kovalyov 2000, Uzsoy 1995). A batching machine has to be 
operated in a way that operations running in parallel must be started at the 
same time. In case of incompatible job families, those operations must also 
belong to the same batching type. Each potroom is modelled as a batching 
machine whose capacity equals the number of casting units that belong to 
the potroom. The batching type of a job identifies the alloy and length of the 
corresponding ingot. 

Obviously, it is not possible to use the installed mould for processing 
another ingot before the casting has been completed. To model the latter 
requirement, we introduce a set of allocatable resources. The units of an 
allocatable resource used for executing an operation remain occupied from 
the start of a given allocating operation up to the completion of the operation. 
For each allocatable resource, at no point in time more units than available 
can be allocated. Moulds of the same type form an allocatable resource whose 
capacity is equal to the number of moulds of that type. A mould gets allocated 
to a casting operation at the start of the corresponding changeover operation. 

We associate each group of casting units belonging to one potroom with a 
changeover resource (cf. Neumann et al. 2001) whose capacity equals the 
number of casting units in the group. The changeover times on this resource 
coincide with the durations of the respective changeover operations. 

Finally, only one of the casting units belonging to a potroom can be 
changed over at a time. Moreover, a mould cannot be installed on any casting 
unit during a casting operation. We take those requirements into account 
by defining an extra renewable resource for each potroom, where again the 
capacity is chosen to be equal to the number of casting units of the potroom. 
The full capacity of this resource is taken up by the changeover and one unit 
by the casting and cooling operations. 

3 Solution method 

The solution procedure is based on a schedule-generation scheme for resource- 
constrained project scheduling presented in Neumann et al. (2001). The con- 
straints that make the problem intractable are 








1. the necessity to select a mode for each job, 

2. the limited capacity of the renewable resources, 

3. the requirement that operations running in parallel on batching machines 
must be of the same batching type and must be started jointly, and 

4. the need for changeover operations with sequence-dependent duration. 

If we relax all those constraints, the remaining problem consists of scheduling
all operations subject to the temporal constraints. This temporal scheduling
problem can efficiently be solved by standard network-flow algorithms (cf.
Ahuja et al. 1993).
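A minimal sketch of the temporal scheduling problem follows. It uses a label-correcting longest-path computation, which is one standard way to obtain earliest start times under minimum and maximum time lags and is chosen here for illustration; the paper does not state which algorithm is used. The encoding of lags as start-to-start differences and the function name are assumptions. The durations are those of ingot type 1 in Table 1 with a 10-minute changeover.

```python
def earliest_start_times(num_ops, lags):
    """Earliest schedule under time lags, or None if they are inconsistent.

    lags is a list of (i, j, delta) meaning start[j] - start[i] >= delta.
    A maximum time lag of m from i to j is encoded as (j, i, -m).
    """
    start = [0] * num_ops
    for _ in range(num_ops):
        changed = False
        for i, j, delta in lags:
            if start[i] + delta > start[j]:
                start[j] = start[i] + delta
                changed = True
        if not changed:
            return start
    return None  # still changing after num_ops passes: positive cycle

# Operations of one job: 0 = changeover (10 min),
# 1 = melting (56 + 60 = 116 min, potroom occupied until end of casting),
# 2 = casting and cooling (60 + 45 = 105 min).
durations = [10, 116, 105]
lags = [
    (0, 2, 10),    # mould installed before the casting starts
    (1, 2, 56),    # casting ends exactly 45 min (cooling) after melting ends,
    (2, 1, -56),   # i.e. start[2] - start[1] = 116 + 45 - 105 = 56
]
start = earliest_start_times(3, lags)
makespan = max(s + d for s, d in zip(start, durations))
```

Resolving a resource conflict in the schedule-generation scheme then amounts to appending further tuples to `lags` and recomputing the earliest schedule.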

An optimal solution to the temporal scheduling problem may be infeasible 
for the production scheduling problem due to two reasons: (i) there may be 
jobs for which no mode has been selected so far, or (ii) one of the resource 
constraints may not be met. In case (i), we select some mode for a job whose 
mode has not been fixed so far. In case (ii), the violation of some resource 
constraint at a given time is resolved by introducing appropriate time lags 
between operations competing for the respective resource (cf. Schwindt and 
Trautmann 2002). By assigning modes to jobs and adding time lags, we ob- 
tain a new scheduling problem with a refined relaxation, which belongs to a 
reduced set of feasible modes and to an expanded set of temporal constraints. 
The selection of modes and the addition of arcs is continued until either a 
mode has been selected for each job and the solution to the temporal schedul- 
ing problem provides a feasible schedule or the temporal scheduling problem 
is unsolvable. Figure 2 summarizes this schedule-generation scheme. 

Fig. 2. Schedule-generation scheme (flow chart: relax resource and
mode-assignment constraints; calculate earliest schedule subject to temporal
constraints; extend mode assignment until all modes are selected; add new
temporal constraints until the schedule is feasible)

Due to the sequence-dependent need for changeovers, the resource demands over
time arising from the changeover operations are not known until the job
sequences on the changeover resources have been established. That is why we
follow a heuristic two-phase approach where in the first phase,
while observing the changeover times, we assume that changeover operations 
do not require resources for their execution. In the terms of our production 
scheduling problem the latter assumption means that we allow for paral- 
lel changeovers on casting units belonging to one and the same potroom. 
When we have obtained a feasible schedule for this subproblem, we fix the 
job sequences on the changeover resources (and thus the durations of the 
changeover operations) by introducing a precedence constraint between any 
two consecutive operations. We then proceed with the second phase, where 
we apply the branching process subject to the time lags introduced thus far. 



4 Computational results 

In this section we return to the example from Section 1, which serves as
a basis for benchmark problems for testing the efficiency of the scheduling
procedure proposed. Ingots can be cast in three different alloys, two different 
lengths, and four different cross sections. For each of the four mould types, 
two moulds are available. Long ingots can only be produced in potroom 2. 
Table 1 gives the alloy, the length, the mould type corresponding to the cross- 
section, and the durations of the individual production steps for the different 
ingot types. We assume that installing the mould takes 10 minutes. 

Based on the schedule-generation scheme described in Section 3, we have 
implemented a depth-first search branch-and-bound procedure enumerating 
alternative mode assignments and alternative sets of minimum time lags be- 
tween operations. We have evaluated the performance of a truncated version 
of the branch-and-bound algorithm (a filtered beam search) using 15 instances 
which have been generated by varying the number of production orders for 
the different ingot types of our example. For each instance, we have imposed 
a CPU time limit of one minute to the scheduling procedure running on a 
Pentium-800 PC. The makespans obtained for the 15 instances are shown in 
Table 2. The results for instances 6 to 10 indicate that the algorithm scales
well. In particular, the makespan-per-order ratio stays between roughly 41 and
49 for these instances and thus seems to be largely independent of the problem
size, even though the CPU time limit has not been varied.
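The scaling claim can be checked directly from the data of Table 2. The following is a minimal sketch; the variable and function names are illustrative and not part of the original study.

```python
# Orders per ingot type and makespans for instances 6 to 10, as listed in Table 2.
# In these instances every one of the 10 ingot types has the same number of orders.
NUM_INGOT_TYPES = 10
orders_per_type = {6: 1, 7: 2, 8: 3, 9: 4, 10: 5}
makespan = {6: 485, 7: 829, 8: 1421, 9: 1664, 10: 2131}


def makespan_per_order(instance):
    """Return the makespan divided by the total number of orders of an instance."""
    total_orders = orders_per_type[instance] * NUM_INGOT_TYPES
    return makespan[instance] / total_orders


ratios = {i: round(makespan_per_order(i), 2) for i in sorted(makespan)}
for instance, ratio in ratios.items():
    print(f"instance {instance}: {ratio} per order")
```

The ratio fluctuates within a narrow band rather than growing with the number of orders, which is what the text refers to.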



Table 1. Ingot data

Ingot Type                    1   2   3   4   5   6   7   8   9  10
Alloy ID                      1   1   1   1   2   2   2   3   3   3
Length (s: short, l: long)    s   l   s   l   s   l   l   s   s   l
Mould Type                    1   2   3   4   1   1   2   3   4   4
Duration Melting [min]       56  56  56  56  73  73  73  57  57  57
Duration Casting [min]       60  80  60  80  60  80  80  60  60  80
Duration Cooling [min]       45  45  45  45  70  70  70  75  75  75








Table 2. Results

            Orders per Ingot Type
Instance     1   2   3   4   5   6   7   8   9  10   Makespan
    1        3   3   3   3   3   3   0   0   0   0        816
    2        0   3   3   3   3   3   3   0   0   0        834
    3        0   0   3   3   3   3   3   3   0   0        848
    4        0   0   0   3   3   3   3   3   3   0        816
    5        0   0   0   0   3   3   3   3   3   3        917
    6        1   1   1   1   1   1   1   1   1   1        485
    7        2   2   2   2   2   2   2   2   2   2        829
    8        3   3   3   3   3   3   3   3   3   3       1421
    9        4   4   4   4   4   4   4   4   4   4       1664
   10        5   5   5   5   5   5   5   5   5   5       2131
   11        3   6   9  12  15   3   6   9  12  15       5121
   12        6   9  12  15   3   6   9  12  15   3       5446
   13        9  12  15   3   6   9  12  15   3   6       4112
   14       12  15   3   6   9  12  15   3   6   9       4731
   15       15   3   6   9  12  15   3   6   9  12       5626



References 

1. Ahuja, R., Magnanti, T., Orlin, J. (1993): Network Flows. Prentice Hall, En- 
glewood Cliffs 

2. Brucker, P., Drexl, A., Möhring, R., Neumann, K., Pesch, E. (1999): Resource-
constrained project scheduling: notation, classification, models, and methods.
European Journal of Operational Research 112, 3-41

3. Fleischmann, B., Jess, H. (1985): Erfahrungen mit einem Simulationsmodell in
einer Aluminiumgießerei. OR Spektrum 7, 175-185

4. Harjunkoski, I., Grossmann, I. (2001): A decomposition approach for the
scheduling of a steel plant production. Computers & Chemical Engineering
25, 1647-1660

5. Kammer, C. (1999): Aluminium Handbook Vol. 1: Fundamentals and Materials.
Aluminium, Düsseldorf

6. Kempf, K.G., Uzsoy, R., Wang, C.S. (1998): Scheduling a single batch process-
ing machine with secondary resource constraints. Journal of Manufacturing
Systems 17, 37-51

7. Neumann, K., Schwindt, C., Zimmermann, J. (2001): Project Scheduling with 
Time Windows and Scarce Resources. Springer, Berlin 

8. Potts, C., Kovalyov, M. (2000): Scheduling with batching: A review. European 
Journal of Operational Research 120, 228-249 

9. Schwindt, C., Trautmann, N. (2002): Production Scheduling with Batching 
Resources: Industrial Context, Model, and Solution Method. Report WIOR- 
618, University of Karlsruhe 

10. Uzsoy, R. (1995): Scheduling batch processing machines with incompatible job 
families. International Journal of Production Research 33, 2685-2708 





Determination of Economic Production Quantity
for a Multi-Stage Production System with Limited
Storage Capacity



U. Buscher¹ and G. Lindner²

¹ Dresden University of Technology, "Friedrich List" Faculty of Transportation
Sciences, D-01062 Dresden

² Otto-von-Guericke University Magdeburg, Faculty of Economics and Manage-
ment, D-39016 Magdeburg



This paper describes a model for a multi-stage production system in which a 
uniform lot size is produced through all stages with a single setup and without in- 
terruption at each stage. Transportation of partial lots, called batches, is allowed 
between stages before the whole lot is completed. Although the batch size must be 
equal at any particular stage, the optimal number of equal-sized batches may differ 
across stages. Additionally, we assume that inventories between consecutive 
stages are constrained by a limited storage capacity. Considering setup costs, in- 
ventory holding costs, and transportation costs, an optimisation method is devel- 
oped to determine the economic production quantity and the optimal batch sizes 
for each stage. 



1 Introduction 

The production of parts for a complex product often involves a series of opera- 
tions (stages). Although the assembly of the final product is frequently continuous 
over time, the fabrication of parts is performed in lots since the rhythm of the as- 
sembly line is seldom synchronised with part-manufacturing. Because of this, 
items facing a continuous demand are manufactured intermittently. Moreover, in 
many cases production rates differ between adjacent stages which leads in effect 
to corresponding work-in-process inventories. 

In such a context, a fundamental shortcoming of most conventional lot size 
models is that they pay no attention to the fact that produced items have to be 
transported from one stage to the next. Implicitly, they restrict the number of con- 
veyances per lot to only two alternatives. Either complete lots are transferred or 
each item is shipped immediately after its completion. It seems to be more realistic 
also to allow the transport of partial lots (batches) larger than one and smaller than 
the entire production quantity (see e.g. [1,4, 5]). 

Furthermore, in practice the storage space for each work-in-process inventory is
always limited. The usable total capacity of these inventories is influenced by
the storage method selected. In this paper, we assume that
each item has a fixed storage bin dedicated solely to it. Including restricted storage







capacity in lot-sizing models is well known. Nevertheless, in the overwhelming 
number of cases, systems are examined which consist of only one inventory. 

In the remainder of this article, we consider a multi-stage production system 
having multiple inventories with constrained storage capacities. The authors know 
of only one other approach in this area of investigation (see [2]). By modelling 
explicitly the influence of transportation frequencies on the maximum process in- 
ventory, a suitable formulation of the storage capacity restriction will be achieved. 
In consequence the range of feasible stage-specific lot sizes expands in compari- 
son to [2]. This, in turn, opens up possibilities for cost-savings. 



2 The Model and the Cost Function 



This section deals with a single-item multi-stage Economic Production Quantity
(EPQ) model. The analysed serial system consists of an arbitrary number S of
manufacturing stages, s = 1, 2, ..., S. The stage that meets the demand for the
finished product is stage S + 1. The main characteristics of our model can be
summarised as follows.

The production process of the considered item comprises a fixed sequence
of operations to be carried out. At each stage one facility is available for
performing the respective operation. The corresponding production rates $P_s$
are assumed to be finite, deterministic, and constant. A uniform lot size Q is
produced through all stages of the underlying system. This quantity is fabricated
without interruption and with a single setup at each stage. A setup causes a
stage-specific fixed setup cost $S_s$. Before the whole lot is finished,
equal-sized batches $q_s$ can be transported from one stage to the next. For every
shipment a fixed stage-specific transportation cost $T_s$ is charged. Moreover, a
different (integer) number $m_s = Q/q_s$ of batches can be realised at the
various stages.

When calculating the EPQ we have to ensure that the maximum process inventory
at stage s does not exceed the related storage capacity $L_s$. All other capacity
constraints of the production system are assumed to hold. The cost of carrying
one unit of physical inventory over one unit of time at a particular stage is
represented by $C_s$. Additionally, the units of the item are assumed to be
infinitely divisible, setup and transportation times are negligibly small, and no
backlogging is permitted. Finally, the demand
rate D for the item is deterministic and constant over an infinite time horizon.

To derive the optimal lot-sizing policy for the entire production system we use
the criterion of minimisation of total relevant costs per unit time. The cost function
C(Q, M), where $M = (m_1, m_2, \ldots, m_S)$, is composed of inventory holding costs
(first and second term), setup costs (third term), and transportation costs (last
term) as follows (see [2, 5]):



s 



C(Q,M) = Q-^ 



s=l 








( 1 ) 








«s = 



max[Ps;Ps+i] 






S=1 



t^s+1 



s=l 



Furthermore, we have to respect the limited storage capacity. In order to
formulate the corresponding constraint, it is helpful to look at the development
of the total inventory per production lot between two consecutive stages s and
s + 1. As can easily be shown, the maximum inventory level is
$Q [1 + P_{s+1}/P_s (1/m_s - 1)]$ when $P_s \ge P_{s+1}$, and it is not allowed to
exceed $L_s$. Since the maximum inventory level is given by
$Q [1 + P_s/P_{s+1} (1/m_s - 1)]$ when $P_{s+1} > P_s$, the general expression
for the storage capacity restriction is:

$Q \le L_s / [1 + \lambda_s (1/m_s - 1)]$ where $\lambda_s = \min[P_s/P_{s+1}; P_{s+1}/P_s]$   (2)

Note that the total cost function C(Q, M) is convex for all positive real values
of Q and of the elements $(m_1; m_2; \ldots; m_S)$ of the vector M. To derive the
optimal lot-sizing policy it is necessary to minimise Eq. (1) subject to (2).
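The storage capacity restriction can be illustrated with a small numerical sketch. The function names and the example figures below are hypothetical; only the two case formulas for the maximum inventory level and their common form (2) are taken from the text.

```python
def storage_ratio(p_s, p_next):
    """Ratio lambda_s = min(P_s / P_{s+1}, P_{s+1} / P_s), always at most 1."""
    return min(p_s / p_next, p_next / p_s)


def max_inventory(q, m_s, p_s, p_next):
    """Maximum inventory between stages s and s+1 for lot size q and m_s batches."""
    lam = storage_ratio(p_s, p_next)
    return q * (1.0 + lam * (1.0 / m_s - 1.0))


def largest_feasible_lot(l_s, m_s, p_s, p_next):
    """Largest lot size whose maximum inventory does not exceed capacity l_s."""
    lam = storage_ratio(p_s, p_next)
    return l_s / (1.0 + lam * (1.0 / m_s - 1.0))


# Hypothetical data: production rates 100 and 80, capacity 300, 4 batches per lot.
q_max = largest_feasible_lot(l_s=300.0, m_s=4, p_s=100.0, p_next=80.0)
print(q_max, max_inventory(q_max, 4, 100.0, 80.0))
```

With a single batch per lot the whole lot has to be stored, so the largest feasible lot equals the capacity; more frequent shipments relax the bound.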



3 Feasible and Optimal Transportation Frequencies 

The total cost function depends on both the lot size and the transportation
frequencies. For the time being, let us ignore the storage capacity constraints.
Obviously, a certain integer $m_s$-value is only optimal for a particular range of
lot sizes. In order to determine this range, one has to look at that part
$H_s(Q, m_s)$ of the total cost function which is influenced by $m_s$. Clearly, for
a given Q it is advantageous to choose the integer shipment frequency $m_s^*$
instead of $m_s^* + 1$ if:

$H_s(Q, m_s^*) \le H_s(Q, m_s^* + 1)$   (3)

Inequality (3) can be used to calculate the critical value of the lot size Q for
which the optimal integer $m_s$-value changes from $m_s^*$ to $m_s^* + 1$. Exactly at
that point both transportation frequencies result in the same cost and (3) is
fulfilled as an equation. Solving this equation for Q gives:

$Q_s(m_s^*; m_s^* + 1) = \sqrt{T_s m_s^* (m_s^* + 1)/a_s}$   (4)

Since the same consideration applies to $m_s^* - 1$ and $m_s^*$, $m_s^*$ is the
optimal integer number of transports if Q lies in the range:

$Q_s(m_s^* - 1; m_s^*) \le Q \le Q_s(m_s^*; m_s^* + 1)$   (5)
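The range argument can be sketched in a few lines. The explicit form of $H_s$ used below is an assumption made only for illustration: a holding part proportional to $a_s Q/m_s$ and a transportation part proportional to $T_s m_s/Q$, with any common factor dropped. Under this assumption the critical lot size between m and m + 1 shipments is the square root of $T_s m (m+1)/a_s$.

```python
import math


def h_cost(q, m, a_s, t_s):
    """Assumed m-dependent cost part: holding a_s*q/m plus transportation t_s*m/q."""
    return a_s * q / m + t_s * m / q


def critical_lot_size(m, a_s, t_s):
    """Lot size at which m and m + 1 shipments cause the same cost h_cost."""
    return math.sqrt(t_s * m * (m + 1) / a_s)


def optimal_frequency(q, a_s, t_s):
    """Smallest integer m whose upper critical lot size is not below q."""
    m = 1
    while q > critical_lot_size(m, a_s, t_s):
        m += 1
    return m


# Hypothetical parameters, chosen only to exercise the functions.
m_star = optimal_frequency(q=50.0, a_s=0.5, t_s=20.0)
print(m_star)
```

The loop walks through the consecutive ranges until the given lot size falls into one of them, which avoids evaluating the cost for every candidate frequency.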

So far we have ignored constraint (2). Taking this restriction into account, the
largest feasible lot size $Q_s^L(m_s)$ for a particular $m_s$ is:

$Q_s^L(m_s) = L_s / [(1 - \lambda_s) + \lambda_s/m_s]$   (6)

As can be seen, the denominator in Eq. (6) is always larger than zero. Hence,
for each number of transports a feasible value $Q_s^L(m_s)$ exists. Moreover,
$Q_s^L(m_s)$ becomes larger the higher the considered $m_s$-values are.
Consequently, a specific shipment frequency $m_s^*$ gives the lowest possible costs
according to $H_s(Q, m_s)$ only up to
$\bar{Q}_s(m_s^*; m_s^* + 1) = \min\{Q_s(m_s^*; m_s^* + 1), Q_s^L(m_s^*)\}$. Following
this line of argument, we can deduce that by incorporating the capacity constraint,
the modified range of Q-values for which $m_s^*$ is optimal can be described as
follows:

$\bar{Q}_s(m_s^* - 1; m_s^*) \le Q \le \bar{Q}_s(m_s^*; m_s^* + 1)$   (7)



4 Optimal Lot Size Policy 

Subsequently we carry out the optimisation for the modelled planning problem in 
two phases. In the first phase an initial feasible solution is generated. Afterwards, 
this solution serves as starting point for the second phase in which we obtain the 
optimal solution. 

Now, we turn to the first phase. For the time being let us ignore restriction
(2) as well as the integrality constraints on the $m_s$-values in order to determine
the global optimum for Q. For this, it is helpful to use a cost function which
represents a lower bound on the total costs for given production lot sizes.
Formulating this function requires solving the extremal equations
$\partial C(Q, M)/\partial m_s = 0$ for each $m_s$ and inserting the resulting
expressions into Eq. (1). After some simplifications we obtain the following cost
function depending only on Q:

S ^ ^ (8) 

C(Q) = 2D.XV^ + QPD + y- 

S=1 ^ 

Thus, the global optimal lot size is given by Q**= The next step consists 

of calculating the unconstrained optimal stage-specific integer transportation
frequencies $m_s$ for Q**. In order to achieve this, it is necessary to go back to
(5) and first solve the left-hand side of this inequality for $m_s^*$ separately.
After that, we solve the right-hand side. Combining those expressions results, for
a given lot size, e.g. Q**, in (for a detailed derivation see [2]):




The up- and down-arrows in (9) denote rounding to the closest integer.
Note that the transportation frequencies $m_s(Q^{**})$ for s = 1, 2, ..., S, denoted
as M**, minimise the total cost function given Q**. However, Q** is usually not the
optimal lot size for M**. Given a vector M, the corresponding optimal lot size is
obtained by solving the extremal equation $\partial C(Q, M)/\partial Q = 0$ for Q:

Q*(M) = 

Inserting M** in Eq. (10) gives Q*(M**). However, Q*(M**) does not necessarily
represent a feasible solution. If Q*(M**) is smaller than
$Q^L(M^{**}) = \min\{Q_1^L(m_1^{**}); Q_2^L(m_2^{**}); \ldots; Q_S^L(m_S^{**})\}$,
then Q*(M**), defined as $Q_a$, constitutes the initial feasible solution;
otherwise it is $Q^L(M^{**}) = Q_a$. The initial feasible solution leads to costs
of $C_a = C(Q_a, M^{**})$.

After we have found this solution, the second phase of our optimisation
procedure begins. Clearly, an upper bound on the cost of the optimal solution is
given by $C_a$. Moreover, using Eq. (8) we can establish a lower bound ($Q_l$) and
an upper bound ($Q_u$) within which the optimal lot size lies. This is achieved by
finding those Q-values yielding a cost equal to $C_a$. We choose $Q_l$ as the
starting point in searching for the optimal lot size. Inserting $Q_l$ in Eq. (9) we
obtain the corresponding unconstrained stage-specific optimal transportation
frequencies. Note that any pair of $Q_l$ and $m_s(Q_l)$ has to satisfy restriction
(2). If (2) is violated, we have to search, for each stage concerned, for the
smallest integer $m_s$-value which guarantees adherence to (2). For this, assume
constraint (2) is binding. Solving this equation for $m_s$ and rounding the
right-hand side up to the next integer value gives:

$m_s^L(Q_l) = \lceil (\lambda_s Q_l) / [L_s - Q_l (1 - \lambda_s)] \rceil$   (11)

Since $m_s^L(Q_l)$ represents the minimum number of transports regarding $Q_l$, the
optimal feasible values of the stage-specific shipment frequencies are given by:

$m_s^+(Q_l) = \max\{m_s^*(Q_l); m_s^L(Q_l)\}$   (12)
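The repair step for a violated capacity constraint can be sketched as follows. The function names and numbers are hypothetical; the logic follows the binding form of constraint (2), solved for the number of shipments and rounded up, and the handling of the case without any feasible frequency is an addition of this sketch.

```python
import math


def min_feasible_frequency(q, l_s, lam):
    """Smallest integer number of shipments keeping the maximum inventory within l_s.

    Returns None if even infinitely many shipments cannot satisfy the capacity,
    i.e. if q * (1 - lam) is not below l_s.
    """
    denominator = l_s - q * (1.0 - lam)
    if denominator <= 0:
        return None
    return max(1, math.ceil(lam * q / denominator))


def feasible_frequency(m_unconstrained, q, l_s, lam):
    """Larger of the cost-optimal and the minimum feasible shipment frequency."""
    m_min = min_feasible_frequency(q, l_s, lam)
    if m_min is None:
        return None
    return max(m_unconstrained, m_min)


# Hypothetical data: lot size 700, capacity 300, lambda_s = 0.8.
print(min_feasible_frequency(700.0, 300.0, 0.8))
print(feasible_frequency(2, 700.0, 300.0, 0.8))
```

If the cost-optimal frequency already satisfies the capacity, it is kept unchanged; otherwise it is raised to the minimum feasible value.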

Recall that for a given vector M only the lot size Q*(M) minimises the
total cost function. However, due to the storage capacity constraint a realisable
lot size cannot be larger than $Q^L(M)$. Hence, after obtaining the vector
$M^+(Q_l) = (m_1^+(Q_l); m_2^+(Q_l); \ldots; m_S^+(Q_l))$, the restricted optimal lot
size is $Q^+ = \min\{Q^*(M^+); Q^L(M^+)\}$. The corresponding costs are
$C(Q^+; M^+(Q_l))$. These costs have to be compared with those of the initial
feasible solution. If $C(Q^+; M^+(Q_l)) > C_a$, we retain our existing temporary
optimal solution. Otherwise, our current solution becomes
the new temporary optimal solution.

Nevertheless, it is possible that solutions with larger lot sizes exist that lead
to lower costs than our temporary optimal solution. To obtain the critical Q-value
for which the vector $M^+$ is no longer optimal, we must determine the critical lot
size for each element $m_s^+$ of the vector $M^+$. Note that an additional shipment
between two adjacent stages can result in lower costs $H_s(Q, m_s)$ only if the
production lot size is at least equal to this critical lot size. For the smallest
critical lot size the corresponding shipment frequency $m_s^+$ has to be increased
by one while all other elements of the vector remain at the same value.
By using Eq. (10), we now calculate the optimal lot size for the modified vector
$M^{mod}$. To evaluate the new solution we have to compute the resulting costs
according to Eq. (1) and compare them with those of our temporary optimal solution.
Obviously, if the latter costs are larger than the costs of the current solution,
we define the new solution as our new temporary optimal solution. Now, an
iterative procedure follows. The steps which we have described for $M^+$ must be
repeated for $M^{mod}$. The algorithm ends if a vector M is found whose optimal lot
size Q*(M) is equal to or greater than the upper bound $Q_u$. Hence, the best
solution found so far is also the optimal solution for the given planning problem.








5 Conclusions 

The objective of this paper was to determine the Economic Production Quantity 
for a multi-stage production system with limited storage capacity. For this purpose 
a two-phase algorithm was developed which minimises the sum of setup costs, in- 
ventory holding costs, and transportation costs. In contrast to most known ap- 
proaches we have taken into account the transportation processes between con- 
secutive stages explicitly. The proposed procedure determines not only the 
optimal production lot size but also the stage-specific shipment frequencies. Due 
to the formulation of the storage capacity restriction we were able to realise possi- 
bilities for considerable cost-savings compared with a conventional formulation of 
the restriction which ignores the influence of transportation activities. 



References 

1. Bogaschewsky R, Buscher U and Lindner G (2001) Optimizing Multi-Stage
Production with Constant Lot Size and Varying Number of Unequal Sized
Batches. Omega 29:183-191

2. Buscher U and Lindner G (2002) Simultane Bestimmung von Fertigungs- und
Transportlosgrößen bei beschränkter Lagerkapazität - Planung stufenbezogener
Transporthäufigkeiten. Working Paper, Bayerische Julius-Maximilians-Universität
Würzburg, Germany

3. Drezner Z, Szendrovits AZ and Wesolowsky GO (1984) Multi-Stage Produc- 
tion with Variable Lot Sizes and Transportation of Partial Lots. European Jour- 
nal of Operational Research 17:227-237 

4. Szendrovits AZ (1975) Manufacturing Cycle Time Determination for a Multi- 
Stage Economic Production Quantity Model. Management Science 22: 
298-308 

5. Szendrovits AZ and Drezner Z (1980) Optimizing Multi-Stage Production with 
Constant Lot Size and Varying Number of Batches. Omega 8:623-629 





A Reverse Logistics Model with Integer Setup
Numbers



Knut Richter¹, Imre Dobos²

¹ European University Viadrina, Frankfurt (Oder), Grosse Scharrnstr. 59,
D-15230 Frankfurt (Oder), Germany, richter@euv-frankfurt-o.de (corresponding
author)

² Budapest University of Economics and Public Administration, Department of
Business Economics, H-1053 Budapest, Veres Pálné u. 36, Hungary,
idobos@mercur.bke.hu



Abstract. A production-recycling system is investigated. A constant demand can
be satisfied by production and recycling. The used items are bought back and then
recycled. The products that are not recycled are disposed of. A model is examined
with EOQ-type inventory holding costs and linear waste disposal, recycling,
production and buyback costs. It will be shown that under these circumstances the
mixed strategies are dominated by the pure strategies. The paper generalizes a
former model proposed by the authors to the case of integer numbers of recycling
and production batches.

Keywords: EOQ model, Reverse logistics, Production, Recycling, Waste disposal, Cost minimization



1. Introduction 

A producer serves a stationary product demand occurring at the rate D > 0.
This demand is served by producing new items as well as by recycling some part
$0 \le \delta \le 1$ of the used products coming back to the producer at a constant
return rate $d = \alpha D$, $0 \le \alpha \le 1$. It is assumed that the producer
is in a position to buy back all used products in order to recycle and/or dispose
of them. The parameters $\delta$ and $\alpha$ are called marginal use rate and
marginal buyback (return) rate, respectively. The remaining part of the
non-serviceable products, $(1-\delta)d$, will be disposed of; $(1-\delta)$ is
called marginal disposal rate. The length of the production and recycling cycle
is T.

The inventory stocks for serviceable products from the production and recycling
processes (PRP) and for the non-serviceable items are determined. The following
cost inputs will be used: the setup costs of recycling $S_R$, the setup costs of
production $S_P$, the per-unit waste disposal cost $C_W$ charged on
$(1-\delta)\alpha D T$, the per-unit linear production cost $C_P$ charged on
$(1-\delta\alpha) D T$, the per-unit linear recycling cost $C_R$ charged on
$\delta\alpha D T$, the per-unit buyback cost $C_B$ charged on $\alpha D T$, the
holding cost of serviceable items $h_s$ and the holding cost of non-serviceable
items $h_n$. All of these parameters are positive. The following notations are
applied to formulate the models: the number of recycling lots, a positive integer
m, the number of production lots, a positive integer n, and the lot size x = DT. If
these variables are fixed, then the demand is satisfied by recycling
$\alpha\delta x$ units in m lots of size $\alpha\delta x/m$ and by producing
$(1-\alpha\delta) x$ units of new items in n lots of size $(1-\alpha\delta) x/n$.
The production and recycling rates are $(1/\beta) D$ and $(1/\gamma) D$.




96 



First, only the EOQ-related setup cost and the holding cost parameters are
considered. The overall cost for a production and recycling cycle T is [12]

2D m 2D n 2D a 

The per time unit cost is then

$C_I(x,m,n,\alpha,\delta) = \frac{D}{x}(m S_R + n S_P) + \frac{x}{2} V(m,n,\alpha,\delta)$   (2)

(3) 

(4) 



with V{m,n,a,S)={h^^rhi^-'j^ 5^ 

m n 

as per unit total holding cost. The formula (3) can also be presented as

$V(m,n,\alpha,\delta) = H_R(\alpha,\delta)\frac{1}{m} + H_P(\alpha,\delta)\frac{1}{n} + H_n(\alpha,\delta)$,

where the factors express three different types of inventory holding cost occurring 
in our system: those induced by collecting and recycling used items, those induced 
by the production of new items, and those induced by collecting those items which 
were used for the first time. Later we need the relationships 

HrM Jip(oc,S) 



M(a,S) = - 



and N{a,S) = - 



(5) 



Let now the non-EOQ-related cost inputs be included. The sum of linear waste
disposal cost, recycling cost, production cost, and buyback cost per unit time is
given by the function [12]

$C_N(\alpha,\delta) = C_W (1-\delta)\alpha D + C_R \delta\alpha D + C_P (1-\delta\alpha) D + C_B \alpha D$.

Hence, the overall per time unit cost is
$C(x,m,n,\alpha,\delta) = C_I(x,m,n,\alpha,\delta) + C_N(\alpha,\delta)$
and the corresponding optimal decision (solution), i.e. the lot size x > 0, the
setup numbers $m, n \in \{1,2,\ldots\}$, the buyback rate $\alpha$ and the use rate
$\delta$, have to be determined.

If the cost function $C_I(x,m,n,\alpha,\delta)$, which is obviously convex in x,
is to be minimized in x > 0 for fixed $m, n \ge 1$ and fixed $\alpha$ and
$\delta$, then the cost-minimal lot size $x(m,n,\alpha,\delta)$ can be derived as

$x(m,n,\alpha,\delta) = \sqrt{2D (S_R m + S_P n) / V(m,n,\alpha,\delta)}$

and the minimal cost is
$C_I(m,n,\alpha,\delta) = \sqrt{2D (S_R m + S_P n) V(m,n,\alpha,\delta)}$.

These results contain the pure strategies, i.e. no recycling
($\alpha = \delta = 0$) and no production ($\alpha = \delta = 1$). The optimal
costs can be written for these cases as

$C_P = C_I(0,1,0,0) = \sqrt{2 D S_P h_s (1-\beta)}$,  $x_P = \sqrt{2 D S_P / (h_s (1-\beta))}$,
$C_R = C_I(1,0,1,1) = \sqrt{2 D S_R (h_s + h_n)(1-\gamma)}$,  $x_R = \sqrt{2 D S_R / ((h_s + h_n)(1-\gamma))}$,   (6)

where the parameter $C_P$ is the minimal cost for the pure strategy production,
and $C_R$ for recycling.
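The lot size and minimal cost expressions follow from the usual EOQ argument. The short derivation below is a sketch that assumes the per time unit cost has the form of a setup part proportional to $1/x$ and a holding part proportional to $x$, as in the text; it is not quoted from the cited papers.

```latex
\begin{align*}
C_I(x) &= \frac{D\,(S_R m + S_P n)}{x} + \frac{x}{2}\,V, \\
\frac{\partial C_I}{\partial x} &= -\frac{D\,(S_R m + S_P n)}{x^{2}} + \frac{V}{2} = 0
  \quad\Longrightarrow\quad
  x = \sqrt{\frac{2D\,(S_R m + S_P n)}{V}}, \\
C_I\bigl(x(m,n,\alpha,\delta)\bigr)
  &= \sqrt{\frac{D\,(S_R m + S_P n)\,V}{2}} + \sqrt{\frac{D\,(S_R m + S_P n)\,V}{2}}
   = \sqrt{2D\,(S_R m + S_P n)\,V}.
\end{align*}
```

At the optimum the setup part and the holding part of the cost are equal, which is why the two square roots coincide.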

The two problems to be studied are the integer program 
Ca = min{c^ (m, n,a,d):m,ne {l,2,.. .}} 



( 7 ) 








and by this way C(d:r,(J) = min{c(m,«,a,^):m,«€ and the continuous 

program min{c(cir, S): a e [0,ll S e [0,l]}. (8) 

EOQ-type reverse logistics problems have been studied extensively in the
literature [1-10]. The model discussed in this paper was introduced in [11]. The
extended model [12] examined the case of continuous setup numbers.

The paper is organized as follows. The next section investigates the
model for integer setup numbers (1). In Section 3 the optimal buyback and
recycling strategy is provided for the integer problem (2), and the last section
summarizes the results of this paper.



2. The case of integer numbers of lots for production and recycling 

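Now we will minimize the cost function $C_I(m,n,\alpha,\delta)$ in order to
determine the optimal number of lots. After some calculation this cost function
can be written in the following form [12]

$C_I(m,n,\alpha,\delta) = \sqrt{2D} \sqrt{A(\alpha,\delta)\frac{m}{n} + B(\alpha,\delta)\frac{n}{m} + C(\alpha,\delta) m + D(\alpha,\delta) n + E(\alpha,\delta)}$   (9)

where

$A(\alpha,\delta) = S_R H_P(\alpha,\delta)$,  $B(\alpha,\delta) = S_P H_R(\alpha,\delta)$,
$C(\alpha,\delta) = S_R H_n(\alpha,\delta)$,  $D(\alpha,\delta) = S_P H_n(\alpha,\delta)$,
$E(\alpha,\delta) = S_R H_R(\alpha,\delta) + S_P H_P(\alpha,\delta)$.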
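The expanded form of the cost function can be checked numerically against the compact one. The sketch below assumes that the per unit holding cost is $V = H_R/m + H_P/n + H_n$ and that the minimal cost equals the square root of $2D (S_R m + S_P n) V$; the numerical values of the H-factors are made up for the test, and the coefficient names mirror A to E with `D_coef` standing for the coefficient D to avoid a clash with the demand rate.

```python
import math


def inventory_cost(m, n, demand, s_r, s_p, h_r, h_p, h_n):
    """Minimal inventory cost for fixed setup numbers, compact form."""
    v = h_r / m + h_p / n + h_n
    return math.sqrt(2.0 * demand * (s_r * m + s_p * n) * v)


def inventory_cost_expanded(m, n, demand, s_r, s_p, h_r, h_p, h_n):
    """Same cost after multiplying out (s_r*m + s_p*n) * v term by term."""
    a = s_r * h_p
    b = s_p * h_r
    c = s_r * h_n
    d_coef = s_p * h_n
    e = s_r * h_r + s_p * h_p
    inner = a * m / n + b * n / m + c * m + d_coef * n + e
    return math.sqrt(2.0 * demand) * math.sqrt(inner)


# Made-up factor values; setup costs and demand as in Example 1.
args = dict(demand=1000.0, s_r=440.0, s_p=1350.0, h_r=60.0, h_p=90.0, h_n=15.0)
print(inventory_cost(3, 2, **args), inventory_cost_expanded(3, 2, **args))
```

Writing the cost in the expanded form separates the terms in m/n, n/m, m and n, which is what the analysis of boundary solutions builds on.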



To solve this problem we can introduce a relaxed auxiliary problem (meta- 
model) ([5]). The continuous solution for the lot numbers m(a,^ and n(a,^ are 



,S),n^(a ,S)) = 



1 ,- 



1 - aS 



aS 



Xp 

(i,i) 



aS 

1 - a 5 



^•ylN{a,S),l 



{a,S)& I 
J 
K 



where 

I=la,S\ ^<<^(40<ar<l,0<5<l}, J={(o;^]| Jj(a)<5<^^(a),0<«<I,0<<5<l}, 

/;: = {(ar,^)|<y>«J 2 («), 0 <«<l, 0 <<y<l}, and = . 

oc Xp+Xp/ ^M(a,S) 



S2{a) = -- 
a , 



Xr 

] SpM(a,S) 



The regions /, J and K were determined in paper [12]. The continuous optimal 
inventory cost function after eliminating the setup numbers is 

\(\-aS ) Cp +aS Cp I {a, 5 ) (a,S)e I 

Cj{a,S)=\ yj 2 D{Sp + Sp)V i\,\,a, 5 ) {a, 5 )ej' 

[aS Cp +{\-aS )Cp I ^N{a, 5 ) {a, 5 )e K 

The optimal solution is automatically integer on set J. In other cases the
optimal integer solution is not necessarily reached on m = 1 or n = 1, i.e. not
necessarily on the boundary of $(m,n) \ge (1,1)$. Let therefore a feasible solution
with m = 1 or n = 1 be called a boundary solution. The next example shows such a
case.








Example 1. Let D = 1,000, $\beta = \gamma = 2/3$, $\alpha = 9/10$, $\delta = 1/2$,
$S_P = 1,350$, $S_R = 440$, $h_s = 850$ and $h_n = 80$. Then the optimal continuous
solution for (m,n) is (1.484, 1) with inventory costs 35,547.4, while the optimal
integer solution is (3,2) with cost value 35,794.3. The values of the cost function
at the points (2,1) and (1,1) are 35,896.6 and 36,157.6. The costs of the optimal
integer boundary solution (2,1) are 0.3 percent higher than those of the optimal
solution (3,2).

The same situation occurs in other models of reverse logistics, for example in
the model of Teunter [9]. This model is similar to the one proposed in our article,
with the difference that Teunter investigates a situation with infinite production
and recycling rates and with inventory holding costs for serviceable items that
differ between remanufactured and manufactured items, i.e. the unit holding costs
for manufactured products are higher than those for remanufactured items. Let us
show by an example that in this model the manufacturing and remanufacturing lot
numbers can both be strictly greater than one in an optimal integer solution. We
take the notation and the numerical example of article [9].

Example 2. Let the demand be $\lambda = 1,000$, the manufacturing and
remanufacturing setup costs $K_m = 750$, $K_r = 100$, the unit holding costs for
manufactured, remanufactured and non-serviceable items $h_m = 200$, $h_r = 50$ and
$h_n = 20$, the return rate r = 0.9 and the reuse rate u = 0.48. Let us assume that
the remanufacturing lot number is R and the manufacturing lot number is M. Then the
optimal continuous solution for (R,M) is (1.489, 1) with inventory costs
$AC^c = 10,845.2$, while the optimal integer solution is (3,2) with cost value
$AC^i = 10,887.6$. This situation is displayed in Fig. 1. The lines are the isocost
lines for the optimal continuous solution ($AC^c$) and the optimal integer solution
($AC^i$). The values of the cost function at (1,1) and (2,1) are 10,964.7 and
10,910.8. The costs of the optimal integer boundary solution (2,1) are 0.213
percent higher than those of the optimal solution (3,2).

The structure of the optimal boundary solution is provided by 
Theorem 1 [5]. The optimal boundary solution is {m^(cx,^ , (a, 5)) with 



(i) {a,S)e 1 => m^{a,S) = \, n^{a,S) = 



{l-aSf 



M(a,S)-¥- 



1 

+— , 
2 



(iii)(a, J)e K =» m^{a,S) = 



2s:2 



a S 



1 1 



•W(a,^)+- + 



, n’’{a,S) = \, 



xl '-'42 

where [xj denotes the maximal integer not greater than x. 

The optimal boundary inventory cost function is 

\ nyo^oj m[a,o) 








Figure 1. An optimal non-boundary solution in the model of [9]




For a wide range of situations the optimal boundary solutions are really opti- 
mal. However, as we have seen, instances can be found, where such solutions are 
not optimal. The upper bound for the deviation from the optimum is given by 

Lemma 1 [5]. Let C\{a,S) denote the minimal value for the optimal boundary 

solution and let C^(a,(J) be the global minimum. Then the relation 

C'’Aa,S)-C%a,d) ^ 1 ^ . . . . . ^ ^ 

— — < — holds, i.e. relative error is less than 2.1 %. 

CW) 

If the continuous optimal setup numbers $(m(\alpha,\delta), n(\alpha,\delta))$ are
integer, then the relative error is zero. Let us call such points
$(\alpha,\delta)$ switching points. Below, these points (lines) are determined.

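Lemma 2 [5]. If the boundary property holds, there are two sets X and Y of
switching points $(\alpha,\delta)$ with the properties

(i) $X = \{(\alpha,\delta): A(\alpha,\delta) = n(n+1)[B(\alpha,\delta) + D(\alpha,\delta)],\ n = 1,2,\ldots\}$ and
$C_I^b(\alpha,\delta) = C_I(1, n^b(\alpha,\delta), \alpha, \delta) = C_I(1, n^b(\alpha,\delta)+1, \alpha, \delta)$ for $(\alpha,\delta) \in X$, and
(ii) $Y = \{(\alpha,\delta): B(\alpha,\delta) = m(m+1)[A(\alpha,\delta) + C(\alpha,\delta)],\ m = 1,2,\ldots\}$ and
$C_I^b(\alpha,\delta) = C_I(m^b(\alpha,\delta), 1, \alpha, \delta) = C_I(m^b(\alpha,\delta)+1, 1, \alpha, \delta)$ for $(\alpha,\delta) \in Y$.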
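The switching condition for the case m = 1 can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the construction of [5]: for m = 1 only the terms $(B + D) n + A/n$ of the squared cost depend on n, so n is preferred over n + 1 as long as $A \le n(n+1)(B + D)$. Away from switching points this gives the closed-form lot number used in `boundary_lots`. The coefficient values are made up.

```python
import math


def n_dependent_cost(n, a, b, d_coef):
    """Part of the squared boundary cost (m = 1) that depends on n."""
    return (b + d_coef) * n + a / n


def boundary_lots(a, b, d_coef):
    """Cost-minimal integer n for m = 1, valid away from switching points."""
    return math.floor(math.sqrt(a / (b + d_coef) + 0.25) + 0.5)


def is_switching_point(n, a, b, d_coef):
    """True if n and n + 1 lots cause exactly the same boundary cost."""
    return math.isclose(a, n * (n + 1) * (b + d_coef))


# Made-up coefficients A, B, D; A = 120 is not a switching point, A = 100 is.
print(boundary_lots(120.0, 3.0, 2.0), is_switching_point(4, 100.0, 3.0, 2.0))
```

At a switching point two neighbouring lot numbers tie, which is why such points separate regions with identical optimal setup numbers.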
Remark. The sets X and Y separate the sets I and K into such subsets of identical 
optimal setup numbers m^(a,S) and n^(ot,S). Then / = and . 

For example, the sets X and Y consist of the union of the following lines 

and Y = S = <?'"(«), 0 < or< 1, 0 < < l}, where 









<r(a) =- 1 — ^ — = . 

a 1 \-MM 

^ ^ y»i(w+l) Sp M(a,S) 

3. The dominance of pure strategies 

In paper [12] we have shown that the EOQ-related and the sum of EOQ-related 
and non EOQ-related costs are dominated by the pure strategies for the case of 
continuous production and recycling setup numbers, i.e. for the inventory cost 

Cj(a,S) > min{Cp; q} and for the total cost model 

C{x, m, n,a,S)> min{Cp + D • Cp, Q 4- D • (Q + Q )} . 

It is obvious that the continuous solutions to these models have lower costs than 
the boundary integer and the global integer solutions: 

C\{a,5) > C^^{a,S) > Cj{a,5) > min{Cp; Cp}, and for the linear costs 
{a, S) + {a, S) > (a, S) + (a, S) > Cj (a, S) + (a, 8) > 

> min{Cp -\-D'Cp,Cj^ +Z) (Cp +Cp)} 

By this last inequality a proof is given for the
Theorem 2 [12]. The optimal inventory holding and production-recycling strategy
in this production-recycling model is a pure strategy: either to produce to meet
the demand ($\alpha = \delta = 0$) or to buy back and recycle all used products
without production ($\alpha = \delta = 1$). The optimal pure strategy can be found
simply by comparing the costs of the two pure strategies.

Example 3. Let D = 1,000, $\beta = \gamma = 2/3$, $S_P = 1,350$, $S_R = 440$,
$h_s = 850$ and $h_n = 80$.

Then the inventory holding costs of recycling are 16,516.7 and those of production
33,326.7. It is economical to recycle with buyback of all used items:
$\alpha = \delta = 1$.

Example 4. Let D = 1,000, $\beta = 2/5$, $\gamma = 2/3$, $S_P = 360$, $S_R = 440$,
$h_s = 85$ and $h_n = 80$. Then the inventory holding costs of production are
6,059.7 and those of recycling 6,957.01. It is more effective to produce and not
to recycle: $\alpha = \delta = 0$.
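The comparison of the two pure strategies can be reproduced with a few lines of code. This is a sketch with illustrative function names; the formulas are the pure-strategy costs of production and recycling given in Section 1, and the data are those of Example 4.

```python
import math


def pure_production_cost(demand, s_p, h_s, beta):
    """Minimal inventory cost when all demand is met by production."""
    return math.sqrt(2.0 * demand * s_p * h_s * (1.0 - beta))


def pure_recycling_cost(demand, s_r, h_s, h_n, gamma):
    """Minimal inventory cost when all demand is met by recycling."""
    return math.sqrt(2.0 * demand * s_r * (h_s + h_n) * (1.0 - gamma))


# Data of Example 4.
c_production = pure_production_cost(demand=1000.0, s_p=360.0, h_s=85.0, beta=2 / 5)
c_recycling = pure_recycling_cost(demand=1000.0, s_r=440.0, h_s=85.0, h_n=80.0, gamma=2 / 3)
best = "production" if c_production <= c_recycling else "recycling"
print(round(c_production, 1), round(c_recycling, 2), best)
```

Because only the two pure strategies have to be compared, no search over the buyback and use rates is needed.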

4. Conclusions and further research 

In this paper we have investigated a production-recycling model. By minimiz- 
ing the inventory holding costs it was shown that one of the pure strategies (to 
produce or to recycle all products) is optimal. A similar proposition can be ob- 
tained minimizing the total EOQ and non-EOQ related costs. A similar result was 
obtained by Richter [3] in a waste disposal model with remanufacturing and by 
Dobos and Richter [1 1, 12] in a production and recycling model. 

These pure strategies may not be technologically feasible: some used products
will not be returned (or, conversely, more items than were sold may come back),
and some of the returned items will not be recyclable. The basic model could be
generalized by introducing an upper bound on the buyback rate which is strictly
smaller than one. In such a case a mixed strategy would be economical compared
to the pure strategy “production”. The answer to this question is left to a future
paper.








References 

[1] Richter, K. (1996): The EOQ repair and waste disposal model with variable 
set-up numbers, European Journal of Operational Research 96, 313-324 

[2] Richter, K. (1996): The extended EOQ repair and waste disposal model. In- 
ternational Journal of Production Economics 45, 443-447 

[3] Richter, K. (1997): Pure and mixed strategies for the EOQ repair and waste 
disposal problem, OR Spektrum 19, 123-129 

[4] Richter, K., Dobos, I. (1999): Analysis of the EOQ repair and waste disposal
model with integer setup numbers. International Journal of Production
Economics 59, 463-467

[5] Dobos, I., Richter, K. (2000): The integer EOQ repair and waste disposal
model - further analysis. Central European Journal of Operations Research 8,
173-194

[6] Schrady, D.A. (1967): A deterministic inventory model for repairable items.
Naval Research Logistics Quarterly 14, 391-398

[7] Nahmias, S., Rivera, H. (1979): A deterministic model for repairable item
inventory system with a finite repair rate. International Journal of Production
Research 17(3), 215-221

[8] Mabini, M.C., Pintelon, L.M., Gelders, L.F. (1998): EOQ type formulation 
for controlling repairable inventories. International Journal of Production 
Economics 54, 173-192 

[9] Teunter, R.H. (2001): Economic Ordering Quantities for Recoverable Item 
Inventory Systems, Naval Research Logistics, Vol. 48, 484-495 

[10] Dobos, I., Richter, K. (1999): Comparison of Deterministic One-Product Re- 
verse Logistics Models, in: Hill, R., Smith, D. (Eds.): Inventory Modelling: A 
Selection of Research Papers Presented at the Fourth ISIR Summer School 
(1999), Exeter 1999, 69-78 

[11] Dobos, I., Richter, K. (2002): A production/recycling model with stationary
demand and return rates. Central European Journal of Operations Research, to
appear

[12] Dobos, I., Richter, K. (2002): An extended production/recycling model with
stationary demand and return rates. Proceedings of the 12th International
Working Seminar on Production Economics, Vol. 1, 47-60, Igls/Innsbruck





Lotsizing in a Production System with Rework 
and Product Deterioration 



K. Inderfurth, G. Lindner and N.P. Rahaniotis 

Otto-von-Guericke University Magdeburg, Faculty of Economics and Manage- 
ment, D-39016 Magdeburg 



Abstract. Producing new or recovering defective products often takes place on a 
common facility, with these activities carried out in lots. Consequently, there is a 
necessity to coordinate the production and rework activities with respect to the 
timing of operations and also with regard to appropriate lot sizes for both proc- 
esses while completely satisfying a given demand. Thereby, it has to be taken into 
account whether the state of defective items that await rework worsens in the 
course of time or not. In this paper we present an EPQ model which addresses all 
of these aspects. Considering set-up and inventory holding costs, optimization al- 
gorithms are developed covering different planning situations. 



1 Introduction 

Manufacturing usually relies on quite sophisticated equipment. Nevertheless,
even production systems composed of high-quality facilities turn out some
defective units. In many cases defective items embody substantial value, and
hence there is an economic incentive to rework those products into an
“as new” condition. Besides this motive, rework activities can also be mandated
by existing laws or motivated by the green image a company wishes to present.

Often rework takes place on the same machinery used for producing new items.
In such a context, a company has two sources for satisfying the demand for a
certain product, namely reworked units and newly produced items.
Consequently, there is a necessity to coordinate the production and rework activi- 
ties with respect to the timing of operations as well as with regard to appropriate 
lot sizes for both processes. Thereby, it has to be taken into account whether the 
state of defective items that await rework deteriorates in the course of time or not. 
Perishable reworkable defectives occur e.g. in the pharmaceutical industry [1]. 

Although determining lot sizes in the presence of deteriorating items has al- 
ready received a lot of attention, all approaches dealing with this problem assume 
(implicitly) that defective items have to be disposed of (see e.g. [2, 6]). Likewise, 
joint planning of production and rework lot sizes has attracted many researchers 
(see e.g. [4, 5]). Nevertheless, the application of the available models is restricted 
to situations where deterioration has no influence on lotsizing decisions. 







To the best of our knowledge, the only approach which integrates both men- 
tioned streams of research was suggested in [7]. There a single-facility manufac- 
turing system is considered where items are produced and, in case of insufficient 
quality, reworked in subsequent lots. Due to time-based deterioration of recover- 
ables it is assumed that rework processing time and costs per item increase line- 
arly with the time span that an item is held in stock and awaits rework. Defects 
occur stochastically, and the objective is to determine the production lot size in 
such a way that the expected system profit, given that all serviceable products can 
be sold at a given price, is maximized. In [7] a closed-form expression for the
objective function is given which can be used to determine the optimal production
lot size numerically.

In contrast to this approach we address a more complicated cost minimization 
problem with additional feasibility constraints. By restricting defects to be caused 
by a deterministic process we are able to develop closed-form expressions not 
only for the objective function but also for the optimal production lot size. These 
formulas can be used for an easy computation of the optimal lot size as well as for 
gaining general insights into how this lot size depends on the various parameters. 

The remainder of this paper is organized as follows. In section 2 we give a de- 
tailed description of the considered planning situation. Subsequently we analyze 
the so-called basic approach which deals with non-deteriorating reworkable items. 
In section 3 we extend the basic approach by incorporating different aspects of de- 
terioration. Finally, conclusions are outlined in section 4. 



2 Fundamentals of the Model 

2.1 The Planning Situation 

In this paper we present a static deterministic lot-size model for a single product, 
focusing on a two-stage manufacturing system within which production as well as 
rework activities are carried out. The main characteristics of our model can be 
summarized as follows. 

The considered manufacturing system faces a constant and continuous demand
$D$ over an infinite time horizon. Moreover, the production and rework activities
are carried out in lots of size $Q_p$ ($Q_r$) at the same facility and with finite
rates $P_p$ ($P_r$). Hence, we have to ensure that the production line is used for
only one process at a time while satisfying the entire demand without backlogging.
Each changeover from production to rework and vice versa causes a set-up cost
(set-up time) of $S_p$ or $S_r$ ($t_{pr}$ or $t_{rp}$), respectively.

According to our coordination policy, one inventory cycle starts with fabricating
a production lot. Due to an imperfect underlying process this quantity contains
only a fraction $\alpha$ ($0 < \alpha < 1$) of serviceable units. A portion $(1-\alpha)$
of units are defectives and will be reworked. After reprocessing these items are
always as good as new. Disposal of defectives is not considered. During production,
all serviceable (reworkable) items are transported continuously to the so-called
serviceables (recoverables) inventory. Demand is satisfied from serviceables
inventory. For every unit in the corresponding inventory a holding cost rate of
$h_s$ ($h_r$) per unit per unit time is charged.

Defective units stay in the recoverables inventory until the rework process is
carried out. This takes place directly after the preceding production run and the
necessary set-up are finished. Once reprocessing of the rework lot, containing
$Q_r = (1-\alpha) Q_p$ units, has started, items are shipped continuously to the
serviceables inventory until all units of the current lot are completed.
Afterwards, the next inventory cycle is initiated such that the first good
quality item out of one production lot arrives at the serviceables inventory exactly
at the moment where all previously delivered products are used up.

Finally, for ascertaining the optimal production and rework lot sizes we use the
criterion of minimization of total relevant cost per unit time. For assuring feasible
solutions, the relation $\alpha P_p > D$ must always hold.

2.2 The Basic Model 

The basic approach is featured by the property that the state of reworkable defec- 
tives does not change in the course of time while they wait to be reworked. Hence, 
the model considers the case of non-deteriorating items. In order to determine the 
optimal solution, we have to solve the minimization problem as stated below. 



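$$C(Q_p) = 0.5\, Q_p H D + F D / Q_p \to \text{Min!} \qquad (1)$$

Subject to:

$$Q_p \ge Q_p^{+} = (t_{rp} + t_{pr})\, D \,/\, [\,1 - D/P_p - (1-\alpha) D/P_r\,] \qquad (2)$$

$$Q_p \ge Q_p^{++} = t_{pr}\, D \,/\, (\alpha - D/P_p) \qquad (3)$$

$$Q_p > 0 \qquad (4)$$

where:

$$H = (1-\alpha)\,[\,(1-\alpha)/P_r + 1/P_p\,]\,(h_r - h_s) + (1/D - 1/P_p)\, h_s$$

$$F = S_p + S_r$$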

The convex objective function (1) comprises the inventory holding costs per
unit time related to the recoverables and the serviceables inventory (first term)
and the fixed set-up costs per unit time (second term). Constraint (2)
guarantees that the time span between two subsequent production runs is large
enough to set up the system for the rework as well as the production lot and to
process both corresponding quantities. From (2) it can easily be seen that $P_r$ can
take on values smaller than $D$. However, a feasible solution exists if and only if
the rework rate fulfills the condition $P_r > (1-\alpha)\, D / (1 - D/P_p)$.

Moreover, restriction (3) ensures in case of $P_r > D$ that the time between the
end of production and the moment where the corresponding serviceables inventory
level hits zero is sufficiently large to prepare the facility for the rework
run. In fact, if $P_r < D$ occurs, constraint (3) is always looser than (2).








If we first ignore constraints (2) and (3), the optimal production lot size can be
obtained by solving the extremal equation $dC(Q_p)/dQ_p = 0$ for $Q_p$:

$$Q_p^{*} = \sqrt{2F/H} \qquad (5)$$

Finally, because of the convexity of the total cost function, the optimal solution
of the constrained planning problem is given by:

$$Q_p^{opt} = \max\{Q_p^{*};\; Q_p^{+};\; Q_p^{++}\} \qquad (6)$$
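
The computation in Eqs. (1) to (6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the argument names and the guard on H > 0 are hypothetical and not taken from the paper, and the parameter values in the usage note are invented.

```python
import math


def basic_lot_size(D, Pp, Pr, alpha, hs, hr, Sp, Sr, t_rp, t_pr):
    """Return (Q_opt, cost per unit time) for the basic model, Eqs. (1)-(6)."""
    if alpha * Pp <= D:
        raise ValueError("infeasible: alpha * Pp must exceed D")
    # Denominator of Eq. (2); it is positive exactly when
    # Pr > (1 - alpha) * D / (1 - D / Pp).
    slack = 1.0 - D / Pp - (1.0 - alpha) * D / Pr
    if slack <= 0:
        raise ValueError("infeasible: rework rate Pr is too small")
    H = ((1 - alpha) * ((1 - alpha) / Pr + 1 / Pp) * (hr - hs)
         + (1 / D - 1 / Pp) * hs)
    if H <= 0:
        # Assumption of this sketch: the convex case H > 0 stated in the text.
        raise ValueError("H must be positive for Eq. (5)")
    F = Sp + Sr
    q_star = math.sqrt(2 * F / H)             # Eq. (5), unconstrained optimum
    q_plus = (t_rp + t_pr) * D / slack         # bound from Eq. (2)
    q_pplus = t_pr * D / (alpha - D / Pp)      # bound from Eq. (3)
    q_opt = max(q_star, q_plus, q_pplus)       # Eq. (6)
    cost = 0.5 * q_opt * H * D + F * D / q_opt  # Eq. (1)
    return q_opt, cost
```

With short set-up times the unconstrained value of Eq. (5) is returned; with long set-up times the bound from Eq. (2) takes over, for example when calling `basic_lot_size(100, 400, 300, 0.8, 2, 1, 50, 30, 1.0, 1.0)`.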



3 Modeling Aspects of Deterioration 

So far we have restricted our attention to the case of non-deteriorating defectives. 
However, in several practical situations this assumption definitely does not hold.
Consequently, unless the rate of deterioration is sufficiently low or there is no
deterioration at all, its impact on modeling the considered decision problem must
not be neglected. In order to study the impact of deteriorating reworkables on
ascertaining the optimal lot size policy, we will extend the planning situation
- as already described - by the following assumptions.

Since the state of defective units worsens in the course of time while they wait 
to be reworked, there will be differences in deterioration among items coming out 
of one production lot. However, in our model we do not consider each item indi- 
vidually, but use an average waiting time approach. Furthermore, we presume that 
the deterioration of recoverables exhibits a significant dependence on the produc- 
tion lot size, since with increasing this quantity the average waiting time of a de- 
fective item will increase. Due to the deterioration caused by waiting, the cost and 
time of rework operations necessary to bring the defective item into an acceptable 
state also will increase. 

Now, let us turn to the first modified planning situation, namely where the
rework cost $c_r$ per unit increases linearly with the lot size $Q_p$. Specifically,
we assume $c_r = c_{rb} + c_{ra} Q_p$. In this relation $c_{rb}$ represents the basic
rework cost per unit whereas $c_{ra}$ denotes the additional rework cost per unit
for each unit incorporated in one production lot. Determining the optimal solution
of the considered problem is quite easy. One simply has to replace $H$ in the
basic approach by $H_I = H + 2(1-\alpha)\, c_{ra}$, and the solution procedure
discussed in the previous section can be applied. Of course, since $H_I$ is larger
than $H$, the unrestricted optimal production lot size is smaller than its
counterpart in the basic approach.
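
The replacement of H follows from a short calculation, sketched here under the assumption (as in the basic model) that one cycle lasts Q_p/D time units and that each cycle reworks (1-α)Q_p units. The symbol C_I for the extended cost function is introduced only for this illustration.

```latex
% Rework cost per unit time: (units reworked per cycle) x (cost per unit) x (cycles per unit time)
\[
(1-\alpha)\,Q_p\,\bigl(c_{rb} + c_{ra}Q_p\bigr)\,\frac{D}{Q_p}
  = (1-\alpha)\,D\,c_{rb} + 0.5\,Q_p\,\bigl[\,2(1-\alpha)\,c_{ra}\bigr]\,D .
\]
% The first term does not depend on Q_p; the second has the same form as the
% holding cost term in Eq. (1), so adding it to Eq. (1) gives
\[
C_I(Q_p) = 0.5\,Q_p\,\bigl[\,H + 2(1-\alpha)\,c_{ra}\bigr]\,D + \frac{F\,D}{Q_p} + (1-\alpha)\,D\,c_{rb}.
\]
```

The constant term does not affect the minimizer, which is why only H has to be replaced.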

Subsequently we take a closer look at the more realistic case where, besides the
rework cost per unit, the rework processing time per unit is also influenced by
the deterioration of recoverables. Analogously to $c_r$ we assume a linear
dependence of the rework processing time per unit on $Q_p$. Thereby, the rework
rate is given by the relation $P_r = 1/(\tau_{rb} + \tau_{ra} Q_p)$. The symbol
$\tau_{rb}$ denotes the basic rework time per unit and $\tau_{ra}$ stands for the
additional rework time per unit for each unit included in one production lot of
size $Q_p$. By substituting the expression for $P_r$ into Eq. (1) the modified
total cost function is represented by:

$$C(Q_p) = 0.5\, Q_p^2\, T\, D + 0.5\, Q_p\, H_{II}\, D + F D / Q_p \to \text{Min!} \qquad (10)$$








where:

$T = (1-\alpha)^2\, \tau_{ra}\, (h_r - h_s)$ and $H_{II} = H_I$ with $P_r = 1/\tau_{rb}$.
Moreover, constraints (2) and (3) change to:

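$$\max\{Q_{pu};\; Q_p^{++}\} \le Q_p \le Q_{po} \qquad (11)$$

with:

$$Q_{pu} / Q_{po} = Z \mp \sqrt{Z^2 - (t_{rp} + t_{pr}) \,/\, [(1-\alpha)\, \tau_{ra}]} \qquad (12)$$

where:

$$Z = \frac{0.5}{\tau_{ra}} \left[ \left( \frac{1}{D} - \frac{1}{P_p} \right) \frac{1}{1-\alpha} - \tau_{rb} \right]$$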

Note, for the planning situation considered here, a feasible solution exists if and
only if the expression under the square root in Eq. (12) is larger than or equal to
zero and if the $Q_p$-value where (3) is binding is smaller than or equal to $Q_{po}$.
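
The bounds of Eq. (12) are the two roots of the quadratic obtained when the capacity requirement behind constraint (2) is written with the lot-size dependent rework time. The following Python sketch computes them together with the feasibility check just described. The function and argument names are hypothetical, and the numbers in the usage note are invented.

```python
import math


def lot_size_bounds(D, Pp, alpha, tau_rb, tau_ra, t_rp, t_pr):
    """Return (Q_pu, Q_po, Q_pp) from Eqs. (3) and (12), or None if infeasible."""
    Z = 0.5 / tau_ra * ((1 / D - 1 / Pp) / (1 - alpha) - tau_rb)
    disc = Z * Z - (t_rp + t_pr) / ((1 - alpha) * tau_ra)
    if disc < 0:
        return None                      # no real roots: Eq. (12) has no solution
    root = math.sqrt(disc)
    q_pu, q_po = Z - root, Z + root      # lower and upper root of Eq. (12)
    q_pp = t_pr * D / (alpha - D / Pp)   # value where constraint (3) is binding
    if q_pp > q_po:
        return None                      # (3) cannot be met below the upper bound
    return q_pu, q_po, q_pp
```

For instance, `lot_size_bounds(100, 400, 0.8, 1/300, 1e-5, 0.01, 0.01)` yields a feasible interval, whereas very long set-up times make the discriminant negative and the function returns `None`.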

For determining the optimal solution of the minimization problem defined just
now we have to distinguish three different cases depending on the relation
between the holding cost rates $h_r$ and $h_s$.

If $h_r = h_s$ the optimal unrestricted production lot size can be obtained by
substituting $H$ in Eq. (5) by $H_{II}$. Since the cost function is convex in this
case, the constrained optimal solution is given by:



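$$Q_p^{opt} = \begin{cases}
\max\{Q_{pu};\; Q_p^{++}\} & \text{if } Q_p^{*} < \max\{Q_{pu};\; Q_p^{++}\} \\
Q_p^{*} & \text{if } \max\{Q_{pu};\; Q_p^{++}\} \le Q_p^{*} \le Q_{po} \\
Q_{po} & \text{if } Q_p^{*} > Q_{po}
\end{cases} \qquad (13)$$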



It is worthwhile to mention that in case of $h_r = h_s$ neither $\alpha$ nor $P_r$
influences Eq. (10). Therefore, the fraction of defectives as well as the
deterioration of recoverables have only an implicit impact on total costs via the
restrictions.

When $h_r$ is larger than $h_s$, Eq. (10) remains convex. However, solving the
extremal equation $dC(Q_p)/dQ_p = 0$ for $Q_p$ in order to ascertain the
unconstrained optimal production quantity now requires solving a cubic equation.
This can be done using available standard methods such as the Newton-Raphson method.
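
As an illustration of this step: setting the derivative of Eq. (10) to zero and multiplying by $Q_p^2/D$ gives the cubic $T Q_p^3 + 0.5\, H_{II} Q_p^2 - F = 0$. A minimal Newton-Raphson sketch in Python follows. The function name, the starting point (the value Eq. (5) would give for T = 0) and the tolerance are choices made for this example only, and it assumes T, H_II and F are all positive.

```python
import math


def unconstrained_lot_size(T, H2, F, tol=1e-10, max_iter=100):
    """Solve T*q**3 + 0.5*H2*q**2 - F = 0 for q > 0 by Newton-Raphson."""
    if T <= 0 or H2 <= 0 or F <= 0:
        raise ValueError("this sketch covers only T > 0, H2 > 0, F > 0")
    # Start at the T = 0 solution; the cubic is non-negative there and
    # increasing and convex for q > 0, so the iterates decrease to the root.
    q = math.sqrt(2 * F / H2)
    for _ in range(max_iter):
        g = T * q**3 + 0.5 * H2 * q**2 - F
        dg = 3 * T * q**2 + H2 * q
        step = g / dg
        q -= step
        if abs(step) <= tol * max(1.0, q):
            return q
    raise RuntimeError("Newton-Raphson did not converge")
```

The result can then be projected onto the interval of Eq. (11) in the same way as in Eq. (13), since the cost function is convex in this case.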

The last case to be analyzed is that of $h_r < h_s$. Unfortunately, if this relation
occurs, the objective function is no longer convex. In this case, for $Q_p > 0$, the
cost function can be shown to be either monotonically decreasing or to have a
single local minimum followed by a single local maximum. Consequently, the
constrained optimal solution can be determined in the following way. First of all,
it has to be checked whether there is a local minimum which fulfills (11) or not. If
such a value exists, we have to compare its total costs with those related to the
lower and the upper bound where (11) is binding in order to define the optimal
solution. Otherwise only the two mentioned boundary values must be tested for
selecting the cost minimizing solution. (For a detailed treatment of the planning
problems discussed in this paper see [3].)
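
The procedure for the non-convex case can be sketched as follows. The sketch works with the cost function in the form of Eq. (10) and with the cubic $g(Q_p) = T Q_p^3 + 0.5\, H_{II} Q_p^2 - F$, whose sign equals the sign of $dC/dQ_p$. For T < 0, g rises up to $Q_p = -H_{II}/(3T)$ and falls afterwards, so a local minimum exists only if g is positive at that point. All names are hypothetical, bisection is used here instead of any method named in the paper, and the bounds are passed in as plain numbers.

```python
def cost(q, T, H2, F, D):
    """Total cost per unit time in the form of Eq. (10)."""
    return 0.5 * q * q * T * D + 0.5 * q * H2 * D + F * D / q


def local_minimum(T, H2, F, iters=200):
    """Local minimizer of Eq. (10) for T < 0, or None if the cost is decreasing."""
    if H2 <= 0:
        return None
    q_turn = -H2 / (3 * T)               # point where g attains its maximum

    def g(q):
        return T * q**3 + 0.5 * H2 * q**2 - F

    if g(q_turn) <= 0:
        return None                      # g never turns positive: no local minimum
    lo, hi = 0.0, q_turn                 # g(lo) < 0 < g(hi): bisect for the root
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def constrained_optimum(T, H2, F, D, q_low, q_high):
    """Compare the interior local minimum (if feasible) with both bounds of (11)."""
    candidates = [q_low, q_high]
    q_loc = local_minimum(T, H2, F)
    if q_loc is not None and q_low <= q_loc <= q_high:
        candidates.append(q_loc)
    return min(candidates, key=lambda q: cost(q, T, H2, F, D))
```

Here `q_low` stands for the lower bound $\max\{Q_{pu}; Q_p^{++}\}$ and `q_high` for $Q_{po}$.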








4 Conclusions 

In this paper we have presented a model which allows the simultaneous determi- 
nation of production and rework lot sizes in the presence of deteriorating rework- 
able defectives. To the best of our knowledge, there is no further lotsizing ap- 
proach available in this area of investigation. 

Of course, the optimality of the solution depends on the assumptions made. The 
relaxation of some of them allows for possible extensions of the model. For in- 
stance alternative forms of deterioration, more general rules for coordinating pro- 
duction and rework lots, and the inclusion of a disposal option could be analyzed. 



References 

1. Flapper SDP, Fransoo JC, Broekmeulen RACM, and Inderfurth K (2002) Plan- 
ning and control of rework in the process industries: a review. Production Plan- 
ning & Control 13: 26-34 

2. Goyal SK and Giri BC (2001) Recent trends in modeling of deteriorating in- 
ventory. European Journal of Operational Research 134: 1-16 

3. Inderfurth K, Lindner G and Rahaniotis NP (2002) Lotsizing in a Production 
System with Rework and Product Deterioration. Working Paper, Otto-von- 
Guericke-University Magdeburg, Germany 

4. Lindner G and Buscher U (2002) An Optimal Lot and Batch Size Policy for a 
Single Item Produced and Remanufactured on One Machine in the Presence of 
Limitations on the Manufacturing and Handling Capacity. Working Paper, 
Otto-von-Guericke-University Magdeburg, Germany 

5. Minner S (2001) Economic Production and Remanufacturing Lot-sizing Under 
Constant Demands and Returns. In: B. Fleischmann et al. (eds) Operations Re- 
search Proceedings 2000. Springer, Berlin et al., pp 328-332 

6. Misra RB (1975) Optimum production lot size for a system with deteriorating 
inventory. International Journal of Production Research 13: 495-505 

7. Teunter RH and Flapper SDP (2003) Lot-sizing for a single-stage single- 
product production system with rework of perishable rework defectives. OR 
Spectrum forthcoming 





Mathematical Programming Models for 
Strategic Supply Chain Planning and Design 



J. Kalcsics, M. T. Melo, and S. Nickel 

Fraunhofer Institut für Techno- und Wirtschaftsmathematik,
Gottlieb-Daimler-Str. 49, D-67663 Kaiserslautern, Germany 



Abstract. Although facility location and configuration of global manufacturing
and distribution networks have been studied for many years, a number of important
real world issues have not received adequate attention in the literature. In this paper
we propose realistic models for the strategic planning and design of global supply
chains. Our main concern is to provide comprehensive models which explicitly
capture the essential elements of many industrial environments. A generic
mathematical programming model is described in detail which became part of Supply
Chain Design, a module within mySAP™ Supply Chain Management, developed
by SAP AG. Moreover, extensions to the dynamic situation are discussed.



1 Introduction 

Structuring global supply chain networks is a complex decision making pro- 
cess. The typical inputs to such a process consist of a set of customer zones to 
serve, demand projections for the different customer zones, a set of products 
to be manufactured and sold, and information about future conditions and 
costs (e.g. transportation and production). Given the above inputs, compa- 
nies have to decide, among other things, where to locate new service facilities 
(e.g. plants, warehouses), how to allocate procurement and production ac- 
tivities to the various manufacturing facilities of the network, and how to 
manage the distribution of products. 

Although facility location and configuration of global manufacturing and
distribution networks have been studied for many years, a number of important
real world issues have not received adequate attention in the literature
(see Bender et al. [1] for a review of models, company-specific case studies,
and decision support systems). For example, the supply chain structure in
existing models typically consists of arborescent networks usually limited to at
most two echelons (e.g. plants and warehouses), a system of distribution channels
between these echelons, and a relatively simple cost structure. Excluding a few
cases, the primary shortcoming of such models arises from the fact that their
results cannot be extrapolated to real world supply chains. Our objective is to fill
this gap by proposing realistic models for the strategic planning and design of
global supply chains. Our models are part of the software tool Supply Chain
Design which assists decision-makers in the design of supply chain networks.
Supply Chain Design is a module within the software mySAP™ Supply
Chain Management, developed by SAP AG, which supports the organization
and operation of supply chain networks (see e.g. Bender et al. [1]). Model
extensions to dynamic planning horizons are also discussed.



2 Strategic Supply Chain Planning 

We consider a general supply chain network where different products are de- 
livered to satisfy the requirements of several customer zones. Since a strict 
categorization of facilities into echelons is not required, the network may 
accommodate any number of different types of facilities (e.g. plants, central 
and regional warehouses), which already operate at the beginning of the plan- 
ning project. No restrictions are imposed on the number of different facility 
types and on the transportation channels used by the company for shipping 
its products. In other words, commodities can be transported between any 
type of facility. This allows, for example, the return of pallets or empties 
(e.g. bottles, printer cartridges) to be modeled. 

Figure 1 shows a supply chain network with suppliers, plants, distribu- 
tion centers (DCs) and customers. The arrows indicate the transportation 
channels that are available to ship the products. In addition to inbound and 
outbound transportation, products may flow between facilities of the same 
type and from customer locations to other facilities. The latter flows describe 
product recovery activities for the purposes of recycling, remanufacturing and 
re-use. 



Fig. 1. Example of a supply chain network (facility types shown: Suppliers,
Plants, Distribution Centers, Customers)



For strategic supply chain planning we consider general networks such as 
that of Figure 1 in order to capture important practical aspects. Continuous 
as well as discrete mathematical models are useful to improve the perfor- 
mance of a supply chain. In particular, models should focus on the analysis 
and redesign of an existing network structure with respect to facility loca- 








tion, procurement, production, distribution and transportation decisions (see 
Kalcsics et al [3]). 

While classical facility location models mainly focus on location-allocation 
decisions, we take the whole supply chain into account. This means that 
our models deal with the planning of supply chain activities such as those 
described above, and thus generalize many well-known location models. 

3 A General Location Model for Strategic Supply 
Chain Planning 

Locations for new facilities are selected among a pre-defined set of candi- 
date sites. Moreover, location planning may be carried out for different types 
of facilities simultaneously (e.g. plants and DCs). Regarding the production 
structure, each end product may have a bill of materials which describes the 
requirements for components, subassemblies and raw materials. In addition 
to production, commodities may be purchased from outside suppliers. The 
objective is to determine the optimal number of operating facilities and their 
locations, as well as the amount of products to procure, the amount of prod- 
ucts to manufacture and the flow of products throughout the network in such 
a way that the overall costs are minimized subject to various side-constraints. 

Before introducing a formulation of the problem, we first define the set
$L$ of facilities. This set includes the so-called selectable and non-selectable
facilities. Selectable facilities form the set $S$, a subset of $L$, and correspond
to existing facilities as well as potential sites for establishing new facilities.
The second category of facilities, the so-called non-selectable, forms the set 
L\S and includes all facilities that already exist prior to the planning period 
and which must remain in operation. Note that a non-selectable facility may 
also have demand requirements, that is, it may correspond to a customer 
zone. In addition, we also define the set P of all product types ranging from 
raw materials and semi-finished items to end products. Further notation is 
introduced next. 

Inputs

$BC_{\ell,p}$ : cost of purchasing one unit of product $p \in P$ in facility $\ell \in L$
from an outside supplier

$a_{\ell,r,p}$ : number of units of product $r \in P$ required to manufacture one
unit of product $p \in P$ in facility $\ell \in L$

$PC_{\ell,p}$ : cost of manufacturing one unit of product $p \in P$ in facility $\ell \in L$

$TC_{\ell,\ell',p}$ : cost of shipping one unit of product $p \in P$ from facility $\ell \in L$ to
facility $\ell' \in L$ ($\ell \ne \ell'$)

$SC_{\ell}$ : costs derived from closing an existing facility $\ell \in S$

$FC_{\ell}$ : fixed costs for setting up a new facility at site $\ell \in S$

$d_{\ell,p}$ : demand for product $p \in P$ at facility $\ell \in L$

$A_1$ : minimum number of selectable facilities to be operated

$A_2$ : maximum number of selectable facilities to be operated

Decision variables

$b_{\ell,p}$ = amount of product $p \in P$ purchased from an outside supplier
at facility $\ell \in L$

$h_{\ell,p}$ = amount of product $p \in P$ manufactured in facility $\ell \in L$

$v_{\ell,\ell',p}$ = amount of product $p \in P$ shipped from facility $\ell \in L$ to
facility $\ell' \in L$ ($\ell \ne \ell'$)

$\delta_{\ell}$ = 1 if facility $\ell$ is operating, 0 otherwise ($\ell \in S$)

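Assuming that all inputs are nonnegative and no capacity limits are set,
our formulation is as follows:

$$\text{MIN} \quad \sum_{\ell \in L} \sum_{p \in P} BC_{\ell,p}\, b_{\ell,p} + \sum_{\ell \in L} \sum_{p \in P} PC_{\ell,p}\, h_{\ell,p} + \sum_{\ell \in L} \sum_{\ell' \in L \setminus \{\ell\}} \sum_{p \in P} TC_{\ell,\ell',p}\, v_{\ell,\ell',p} + \sum_{\ell \in S} FC_{\ell}\, \delta_{\ell} + \sum_{\ell \in S} SC_{\ell}\, (1 - \delta_{\ell}) \qquad (1)$$

s. t.

$$\sum_{\ell' \in L \setminus \{\ell\}} v_{\ell',\ell,p} + b_{\ell,p} + h_{\ell,p} = \sum_{\ell' \in L \setminus \{\ell\}} v_{\ell,\ell',p} + \sum_{r \in P} a_{\ell,p,r}\, h_{\ell,r} + d_{\ell,p}, \quad \ell \in L,\; p \in P \qquad (2)$$

$$\sum_{\ell' \in L \setminus \{\ell\}} \sum_{p \in P} v_{\ell,\ell',p} \le M\, \delta_{\ell}, \quad \ell \in S \qquad (3)$$

$$\sum_{\ell \in L \setminus S} \sum_{p \in P} v_{\ell,\ell',p} \le M\, \delta_{\ell'}, \quad \ell' \in S \qquad (4)$$

$$\sum_{p \in P} h_{\ell,p} \le M\, \delta_{\ell}, \quad \ell \in S \qquad (5)$$

$$\sum_{p \in P} b_{\ell,p} \le M\, \delta_{\ell}, \quad \ell \in S \qquad (6)$$

$$A_1 \le \sum_{\ell \in S} \delta_{\ell} \le A_2 \qquad (7)$$

$$b_{\ell,p} \ge 0, \quad h_{\ell,p} \ge 0, \quad v_{\ell,\ell',p} \ge 0, \quad \ell \in L,\; \ell' \in L \setminus \{\ell\},\; p \in P \qquad (8)$$

$$\delta_{\ell} \in \{0, 1\}, \quad \ell \in S \qquad (9)$$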



The total costs to be minimized are described in (1). These range from
procurement, production, and transportation costs to fixed charges for opening
and closing facilities. Note that the latter costs may also include operating
costs. Furthermore, handling costs for incoming and outgoing goods in a given
facility can easily be incorporated in (1). Constraints (2) are conservation of
flow equations, where the inbound flow to a facility $\ell$ with respect to some
product $p$ results from procurement and production activities at the facility
plus the total product flow from other facilities $\ell' \ne \ell$. The outbound flow in
equations (2) consists of the product demand to be satisfied at facility $\ell$, the
amount of product $p$ used as raw material in the production of other products,
and the total product flow to other facilities $\ell' \ne \ell$. Inequalities (3), (4), (5)
and (6) ensure that transportation, production and procurement activities are
only allowed in operating facilities. The fixed term $M$ denotes the maximum
amount that may be required by any facility with respect to all product types.
The calculation of $M$ takes not only the external demand $d_{\ell,p}$ into account,
but also the secondary requirements for a product $p$ that may arise due to
production activities where $p$ is used as a raw material or intermediate product.
Finally, constraints (7) impose a lower and upper limit on the total number
of facilities to be selected from the set $S$.
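
To make the roles of objective (1) and the flow conservation equations (2) concrete, the following Python sketch evaluates a given candidate solution on an invented toy instance. It is not a solver and has nothing to do with the implementation in Supply Chain Design. Instead of the big-M inequalities it checks their logical content as described above (a closed selectable facility must have no procurement, production or shipments), which is an interpretation made for this example. All names and numbers are hypothetical.

```python
def total_cost(data, sol):
    """Objective (1): procurement, production, transport and fixed charges."""
    cost = 0.0
    for key, qty in sol["b"].items():      # key = (facility, product)
        cost += data["BC"][key] * qty
    for key, qty in sol["h"].items():      # key = (facility, product)
        cost += data["PC"][key] * qty
    for key, qty in sol["v"].items():      # key = (from, to, product)
        cost += data["TC"][key] * qty
    for l in data["S"]:
        delta = sol["delta"][l]
        cost += data["FC"][l] * delta + data["SC"][l] * (1 - delta)
    return cost


def is_feasible(data, sol, eps=1e-9):
    """Check (2), the intent of (3)-(6), and (7) for a candidate solution."""
    b, h, v, delta = sol["b"], sol["h"], sol["v"], sol["delta"]
    for l in data["L"]:
        for p in data["P"]:
            inflow = sum(q for (i, j, r), q in v.items() if j == l and r == p)
            outflow = sum(q for (i, j, r), q in v.items() if i == l and r == p)
            # a[(l, p, r)]: units of p needed per unit of r made in facility l
            consumed = sum(data["a"].get((l, p, r), 0) * h.get((l, r), 0)
                           for r in data["P"])
            lhs = inflow + b.get((l, p), 0) + h.get((l, p), 0)
            rhs = outflow + consumed + data["d"].get((l, p), 0)
            if abs(lhs - rhs) > eps:
                return False
    for l in data["S"]:
        if delta[l] == 0:                  # closed facility: no activity allowed
            activity = sum(q for (i, _), q in b.items() if i == l)
            activity += sum(q for (i, _), q in h.items() if i == l)
            activity += sum(q for (i, j, _), q in v.items() if l in (i, j))
            if activity > eps:
                return False
    n_open = sum(delta[l] for l in data["S"])
    return data["A1"] <= n_open <= data["A2"]


# Toy instance: one plant, one selectable distribution center, one customer zone.
data = {
    "L": ["plant", "dc", "cust"], "S": ["dc"], "P": ["raw", "widget"],
    "BC": {("plant", "raw"): 0.5}, "PC": {("plant", "widget"): 2.0},
    "TC": {("plant", "dc", "widget"): 1.0, ("dc", "cust", "widget"): 1.5,
           ("plant", "cust", "widget"): 3.0},
    "a": {("plant", "raw", "widget"): 2}, "d": {("cust", "widget"): 10},
    "FC": {"dc": 100.0}, "SC": {"dc": 20.0}, "A1": 0, "A2": 1,
}
via_dc = {"b": {("plant", "raw"): 20}, "h": {("plant", "widget"): 10},
          "v": {("plant", "dc", "widget"): 10, ("dc", "cust", "widget"): 10},
          "delta": {"dc": 1}}
direct = {"b": {("plant", "raw"): 20}, "h": {("plant", "widget"): 10},
          "v": {("plant", "cust", "widget"): 10}, "delta": {"dc": 0}}
```

Both candidate solutions satisfy the constraints. In this invented instance, shipping directly and closing the distribution center is cheaper than routing through it, which can be confirmed by comparing `total_cost(data, direct)` with `total_cost(data, via_dc)`.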

The above formulation describes a large scale Mixed Integer Problem
(MIP) which is solved for problems of reasonable size with the commercial
package ILOG CPLEX 7.0 using the C++ modeling library ILOG Concert
Technology 1.0 (ILOG [2]).

A natural extension of the above MIP formulation is to consider capacity 
constraints. These are modeled in terms of consumption of resources required 
for manufacturing and handling commodities. Resources can be site and prod- 
uct independent meaning that a given (production/handling) resource may 
be used by different products in different facilities. This generalizes the clas- 
sical way capacity availability is modeled in facility location problems. In 
addition, overtime work can be considered by allowing extension of the avail- 
able resources at the expense of additional costs. The decision maker is also 
given the possibility to allow external demand not to be completely met. In 
the case that part or all of the demand of a customer zone for a given product 
is not satisfied, a penalty cost is charged. All these extensions are included 
in the software module Supply Chain Design. 

4 Outlook: Extension to Dynamic Situations 

Since the establishment of facilities is a comprehensive project requiring sub- 
stantial investment capital and coordination of all operational aspects in- 
volved in setup activities, it is often necessary to smooth the relocation pro- 
cess and eventual phase-out of facilities in a dynamic multiperiod process. A 
gradual relocation of facilities in terms of shifting capacity from an existing 
site to the site of a newly established facility has the advantage of not dis- 
rupting supply chain activities and allows relocation plans to be carried out 
smoothly. For example, in some cases it is not possible to relocate a large 








facility such as a plant in one step, both from an investment and operational 
point of view. Instead, it is preferable to gradually transfer production ca- 
pacity from the current location to a new site. Such a smooth transition also 
allows a better management of the required investment capital. Instead of in- 
vesting a substantial amount at once and thus putting great financial strain 
on the company, capital expenditures can be distributed among several time 
periods. Another practical situation to which a relocation scenario applies 
concerns the merging of companies. In this case, formerly separated supply 
chains are consolidated and a joint system structure needs to be planned. 
This entails shutting down some of the existing facilities and concentrating 
capacities in new locations. 

Currently, we are focusing on the extension of our static models to the case 
of dynamic relocation of facilities (see Melo et al [4]). The new mathematical 
programming models capture a number of important practical aspects and 
focus on the following strategic issues: 

• Where and when should a partial or total relocation of existing facilities 
take place? 

• Using the available budget, how gradually should the capacity of existing 
facilities be reduced and when should the transition to newly established 
facilities be made without disrupting the supply chain activities? 

• How should the supply chain be operated in each planning period with 
respect to the provision of materials, storage and shipment of products 
so that demand requirements are satisfied at least cost? 

References 

1. T. Bender, H. Hennes, J. Kalcsics, M. T. Melo, and S. Nickel. Location Software
and Interface with GIS and Supply Chain Management. In Z. Drezner and H. W.
Hamacher, editors. Facility Location: Applications and Theory, chapter 8, pages
233-274. Springer-Verlag, Berlin, Heidelberg, 2002.

2. ILOG Optimization Suite. ILOG, Inc., Incline Village, Nevada, 2000.
http://www.ilog.com/products/optimization.

3. J. Kalcsics, T. Melo, S. Nickel, and V. Schmid-Lutz. Facility Location Decisions
in Supply Chain Management. In K. Inderfurth, G. Schwödiauer, W. Domschke,
F. Juhnke, P. Kleinschmidt, and G. Wäscher, editors. Operations Research
Proceedings 1999, pages 467-472. Springer-Verlag, Berlin, 2000.

4. M. T. Melo, S. Nickel, and F. Saldanha da Gama. Dynamic multi-commodity
capacitated facility location: A mathematical modeling framework for strategic
supply chain planning. Technical report, Fraunhofer Institut für Techno- und
Wirtschaftsmathematik (ITWM), Kaiserslautern, Germany, 2002. In
preparation.





A Dynamic Bargaining Model of
Supply Chain Management



Eric Sucky

Seminar für Logistik und Verkehr, Johann Wolfgang Goethe-Universität,
Mertonstr. 17, 60054 Frankfurt am Main, esucky@wiwi.uni-frankfurt.de



1 Introduction

Supply chain management (SCM) focuses on the cross-company coordination of
material and information flows over the entire value-creation process, from raw
material extraction through the individual processing stages to the end customer
(cf. [8], p. 8). Coordination in the supply chain (SC) can in general follow
either the hierarchical or the heterarchical principle: under the hierarchical
principle, a superordinate planning level draws up framework plans that serve as
targets for the subordinate planning levels, whereas under the heterarchical
principle the actors in the SC reach their decisions by mutual agreement
(cf. [10], p. 5). In the context of SCM, the hierarchical coordination principle
must be viewed critically: “Most supply chains are composed of independent
agents with individual preferences. These agents could be distinct firms or they
could even be managers within a single firm. (...) It is also reasonable to
assume that each agent will attempt to optimize his own preference, knowing that
all of the other agents will do the same.” ([2], p. 113). Cross-company
coordination of value-creation processes requires solving planning problems with
multiple objectives held by different persons. As a rule, coordination follows
the heterarchical principle, i.e., the actors coordinate in a decentralized
manner and reach their interdependent decisions through direct interaction in
negotiations (cf. [3], pp. 55-59).

The subject of this paper is the planning problem of determining integrated
ordering and production policies in the SC. Linking the procurement and
production processes fixed by the ordering and production policies of the SC
partners yields the material flow through the entire SC. While production and
logistics approaches to planning integrated material flows largely neglect
competitive aspects and are dominated by hierarchical coordination principles,
game-theoretic approaches are suited to analyzing the strategic interaction
between companies and to supporting decentrally coordinated planning (cf. [6],
p. 307). The aim of this paper is to develop a bargaining model for determining
integrated ordering and production policies.







2 The Bargaining Problem

Under a framework agreement between a buyer (A) and a manufacturer (P), the
known period demand b [QU/PE] is covered by deliveries of quantity x [QU]
(QU: quantity units, PE: period, MU: monetary units). Aiming to minimize their
relevant costs per period, (A) and (P) determine their individually optimal
order quantity and lot size, respectively, on the basis of the classical
lot-size formula. (P) plans its production policy on a lot-for-lot basis, i.e.,
the delivery quantity equals the lot size. The order quantity of (A) likewise
equals the delivery quantity, so the order quantity of (A) must equal the lot
size of (P) (cf. [9], p. 965). The strictly convex cost functions of (A) and (P)
are:

$z_A(x) = B \frac{b}{x} + \frac{x}{2} h_A ; \quad z_P(x) = R \frac{b}{x} + \frac{x}{2} \frac{b}{d} h_P$ . (1)

b : period demand of (A) [QU/PE]        d : production rate of (P) [QU/PE]

R : setup cost of (P) [MU]              h_P : holding cost rate of (P) [MU/(QU PE)]

B : ordering cost of (A) [MU]           h_A : holding cost rate of (A) [MU/(QU PE)]

For both objectives, “minimization” is the form of the level preference
relation. The planning problem of determining the integrated ordering and
production policy can be formulated as a vector minimum problem:

$\min \{ z(x) = (z_A(x), z_P(x))' \mid x \in \mathbb{R}_+ \}$ . (2)




If there is an ordering and production policy x for which both objective
functions in (1) attain their individual minima, this solution is called a
perfect solution of (2), or a perfect ordering and production policy. The
individually optimal ordering and production policies of (A) and (P) are:

$x_A^* = \sqrt{2bB / h_A} ; \quad x_P^* = \sqrt{2Rd / h_P}$ . (3)

A perfect solution of (2) exists only if the individually optimal order quantity
and the individually optimal lot size are identical. In this case
$x = x_A^* = x_P^*$ holds, and the ideal objective value vector is:

$z^* = (z_A(x), z_P(x))' = (\sqrt{2bBh_A}, \sqrt{2Rb^2 h_P / d})'$ . (4)

In real planning situations the individually optimal order quantity will not
equal the individually optimal lot size. In general, therefore,
$x_A^* \neq x_P^*$ holds and (2) has no perfect solution. The actors must find a
compromise solution, i.e., they must agree in negotiations on the ordering and
production policy to be implemented. Candidates for a bargaining solution are
ordering and production policies that are functionally efficient with respect to
the objective functions in (1). The set of functionally efficient ordering and
production policies is:

$X^{eff} = \{ x \in \mathbb{R}_+ \mid x \in [x_A^*, x_P^*] \}$ . (5)

For every ordering and production policy $x \notin X^{eff}$, at least one
ordering and production policy $x \in X^{eff}$ can be given for which (A) and
(P) realize lower costs. Alternatives that are not functionally efficient are
not potential integrated ordering and production policies; choosing them
violates the principle of formal rationality (cf. [7], p. 134). For
$x_A^* \neq x_P^*$ there is a conflict of objectives: starting, for example,
from the individually optimal order quantity of (A), an ordering and production
policy x with $x \neq x_A^*$ and $x \in X^{eff}$ leads to higher costs for (A),
while (P) realizes lower costs. It holds that:

$z_A(x_P^*) > z_A(x) > z_A(x_A^*)$ and $z_P(x_A^*) > z_P(x) > z_P(x_P^*) \quad \forall x \in ]x_A^*, x_P^*[$ . (6)
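
The conflict of objectives stated in (6) can be checked numerically. The
following is a minimal sketch, assuming the data of the example in Section 4
(Table 1); the function and variable names and the 21-point grid are
illustrative choices, not part of the model.

```python
from math import sqrt

# Data of the example in Section 4 (Table 1).
b, d = 10_000, 16_000      # period demand of (A), production rate of (P)
B, R = 100, 240            # ordering cost of (A), setup cost of (P)
h_A, h_P = 50, 48          # holding cost rates of (A) and (P)

def z_A(x):
    """Buyer's cost per period, first function in (1)."""
    return B * b / x + x / 2 * h_A

def z_P(x):
    """Manufacturer's cost per period, second function in (1)."""
    return R * b / x + x / 2 * (b / d) * h_P

# Individually optimal quantities from the classical lot-size formula.
x_A_opt = sqrt(2 * b * B / h_A)
x_P_opt = sqrt(2 * R * d / h_P)

# Condition (6): between the two optima the buyer's cost rises and the
# manufacturer's cost falls as x moves from x_A* toward x_P*.
grid = [x_A_opt + k * (x_P_opt - x_A_opt) / 20 for k in range(21)]
buyer_rises = all(z_A(u) < z_A(v) for u, v in zip(grid, grid[1:]))
maker_falls = all(z_P(u) > z_P(v) for u, v in zip(grid, grid[1:]))

print(x_A_opt, x_P_opt)
print(buyer_rises, maker_falls)
```

Both flags come out true for these data, so every move inside the interval
benefits one actor at the expense of the other.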

Determining (5) reduces the complexity of the planning problem underlying the
vector minimum problem (2). It does not, however, yield a final choice of the
integrated ordering and production policy. A conflict of interest remains, since
(A) and (P) each want to implement their individually optimal solution.
Bargaining problems with the possibility of concluding binding contracts are the
subject of cooperative game theory, whose solution concepts can be classified
into axiomatic approaches and approaches that model concrete bargaining
processes (cf. [5], pp. 24-25). Characteristic of axiomatic approaches is that
they do not represent negotiations as a temporal sequence of bids by the actors,
but try to solve bargaining problems by offering the actors a solution that can
be judged “fair” (cf. [1], p. 150). In this paper, however, the bargaining
process between (A) and (P) is modeled explicitly.



3 The Bargaining Solution

A bargaining model is developed that represents mutual offers and concessions
and determines the integrated ordering and production policy as the outcome of a
dynamic bargaining process. The outcome of negotiations depends on the power
relations between the negotiating parties. “A firm’s power reflects its
potential for influence on the decision making and behavior of another firm.”
([4], p. 58). A distinction can be made between asymmetric and symmetric
distributions of power. For the problem considered here, an asymmetric
distribution of power means that one actor is in a position to impose its
individually optimal solution as the integrated ordering and production policy.
In the following, a symmetric distribution of power is assumed, so that no actor
can implement its individually optimal solution against the will of the other.

In the bargaining process to be analyzed, both actors can submit an offer at the
points in time t = 0, 1, 2, ..., T. Here $x_i^t$ denotes the ordering and
production policy offered by actor $i \in \{A, P\}$ at time t. The objective
value vector resulting from the offer $x_i^t$ of actor $i \in \{A, P\}$ at time
t is $z(x_i^t)' = (z_A^t(x_i^t), z_P^t(x_i^t))$, with $z_A^t(x_i^t)$ and
$z_P^t(x_i^t)$ as the corresponding objective values of (A) and (P). The actors
behave individually rationally with respect to their objectives. An offer
$x_i^t$ is individually rational if:

$z_A^t(x_A^t) < z_A^{t-1}(x_P^{t-1})$ or, respectively, $z_P^t(x_P^t) < z_P^{t-1}(x_A^{t-1})$ for t = 1, 2, ..., T. (7)

An offer of an actor $i \in \{A, P\}$ is called a concession if:

$z_P^{t+1}(x_A^{t+1}) < z_P^t(x_A^t)$ or, respectively, $z_A^{t+1}(x_P^{t+1}) < z_A^t(x_P^t)$ for t = 0, 1, 2, ..., T-1. (8)

A concession is called a full concession if an actor accepts an offer:

$x_A^{t+1} = x_P^t$ or, respectively, $x_P^{t+1} = x_A^t$ for t = 0, 1, 2, ..., T-1. (9)

A concession is partial if it satisfies (8) but is not a full concession. For
the analysis of the bargaining process it is assumed that both actors submit an
offer at every bargaining time t. It is further assumed that both actors have a
strong interest in reaching an agreement, i.e., a breakdown of the negotiations
causes higher costs for both actors than accepting the least favorable offer. If
the actors behave individually rationally with respect to their objectives, at
time t = 0 they each offer their individually optimal solution as the ordering
and production policy. It follows that $x_A^0 = x_A^*$ and $x_P^0 = x_P^*$. If
$x_A^* \neq x_P^*$, no agreement results and further offers must be submitted at
time t = 1. At time t = 1, (A), for example, has the following alternatives:

(A) repeats its offer: $x_A^1 = x_A^0 = x_A^*$,

(A) makes a full concession: $x_A^1 = x_P^0 = x_P^*$, or
(A) makes a partial concession: $x_A^1 \neq x_P^0 = x_P^*$,

with $z_P^1(x_A^1) < z_P^0(x_A^0)$ and $z_A^1(x_A^1) < z_A^0(x_P^0)$.

Repeating the offer from time t = 0 carries the risk that (P) breaks off the
negotiations for lack of accommodation, so that they fail. A full concession, in
turn, is not individually rational, since $z_A(x) < z_A(x_P^*)$ holds for all
$x \in [x_A^*, x_P^*[$. The only individually rational alternative for (A) at
time t = 1 is to make a partial concession. The same holds for (P). At time
t = 1 both actors therefore make an individually rational, partial concession
that satisfies conditions (7) and (8). To make these concessions the actors need
no information about the other's cost function, i.e., the course of the
bargaining process is independent of the information available to the parties.
Partial, individually rational concessions can be made solely on the basis of
the offers at time t = 0. At time t = 1 the following concessions result:





$x_A^1 \in ]x_A^0, x_P^0[ = ]x_A^*, x_P^*[ ; \quad x_P^1 \in ]x_A^0, x_P^0[ = ]x_A^*, x_P^*[$ . (10)

For concessions at the subsequent points in time it follows analogously:

$x_A^t \in ]x_A^{t-1}, x_P^{t-1}[ , \quad x_P^t \in ]x_A^{t-1}, x_P^{t-1}[$ for t = 2, 3, ..., T. (11)

The negotiations end when an actor makes a full concession. To determine the
ideal-typical bargaining solution, it must be established how long concessions
are possible. For a possible concession by (A) it holds that:

$(z_A(x_A^{t+1}) - z_A(x_P^*)) \cdot (z_P(x_A^{t+1}) - z_P(x_A^*)) \geq (z_A(x_A^t) - z_A(x_P^*)) \cdot (z_P(x_A^t) - z_P(x_A^*))$ . (12)

The same holds for (P). The negotiations end when no more concessions satisfying
condition (12) are possible. This is the case for the ordering and production
policy $x \in ]x_A^*, x_P^*[$ that maximizes
$z = (z_A(x) - z_A(x_P^*)) (z_P(x) - z_P(x_A^*))$.
The ideal-typical bargaining process is illustrated with an example.
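
The policy that maximizes z can also be located numerically. The sketch below
is only an illustration under stated assumptions: it uses the data of Table 1,
and the plain grid search with step 0.01 is a hypothetical choice made here,
not a method taken from the paper.

```python
from math import sqrt

b, d, B, R, h_A, h_P = 10_000, 16_000, 100, 240, 50, 48  # Table 1

def z_A(x):
    return B * b / x + x / 2 * h_A

def z_P(x):
    return R * b / x + x / 2 * (b / d) * h_P

x_A_opt = sqrt(2 * b * B / h_A)
x_P_opt = sqrt(2 * R * d / h_P)

def nash_product(x):
    """z: product of both actors' savings relative to their worst agreement."""
    return (z_A(x) - z_A(x_P_opt)) * (z_P(x) - z_P(x_A_opt))

# Evaluate z on a grid strictly inside ]x_A*, x_P*[ (step 0.01, assumed).
steps = round((x_P_opt - x_A_opt) / 0.01)
candidates = [x_A_opt + k * 0.01 for k in range(1, steps)]
x_best = max(candidates, key=nash_product)

print(round(x_best, 2), round(nash_product(x_best), 1))
```

A grid search needs no assumption about the shape of z; a derivative-based
method would be faster but is not required at this problem size.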



4 An Example



Table 1 shows the relevant data underlying the example.



Table 1. Data of the example

Buyer (A)                    Manufacturer (P)
b = 10000 [QU/PE]            d = 16000 [QU/PE]
B = 100 [MU]                 R = 240 [MU]
h_A = 50 [MU/(QU PE)]        h_P = 48 [MU/(QU PE)]



The actors do not know whether $x_A^* = x_P^*$, $x_A^* > x_P^*$, or
$x_A^* < x_P^*$ holds. The only individually rational offers at time t = 0 are
the individually optimal solutions: $x_A^0 = x_A^* = 200$ and
$x_P^0 = x_P^* = 400$. The offers are not compatible, so at time t = 1 both
actors make a partial, individually rational concession with
$x_A^1, x_P^1 \in ]200, 400[$, e.g. $x_A^1 = 250$ and $x_P^1 = 350$. The
negotiations continue until one of the actors accepts an offer or until no more
partial concessions are possible. Table 2 shows a possible course of the
negotiations up to the ideal-typical bargaining solution x = 282.84.



Table 2. Course of the bargaining process

t   x_A^t    z_A(x_A^t)  z_P(x_A^t)  z          x_P^t    z_A(x_P^t)  z_P(x_P^t)  z
0   200      10000       15000       0          400      12500       12000       0
1   250      10250       13350       3712500    350      11607.1     12107.1     2582919
2   270      10453.7     12938.9     4217669.9  300      10833.3     12500       4166675
3   280      10571.4     12771.4     4298077.9  290      10698.3     12625.9     4277535.5
T   282.84   10606.6     12727.9     4301994.1  282.84   10606.6     12727.9     4301994.1
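
The offers in Table 2 can be checked against conditions (7), (8), (10) and (11).
The following is a minimal sketch under the data of Table 1; the helper names
are hypothetical, and the time index on z is dropped because the cost functions
themselves do not change over time.

```python
b, d, B, R, h_A, h_P = 10_000, 16_000, 100, 240, 50, 48  # Table 1

def z_A(x):
    return B * b / x + x / 2 * h_A

def z_P(x):
    return R * b / x + x / 2 * (b / d) * h_P

offers_A = [200, 250, 270, 280]   # x_A^t for t = 0..3, taken from Table 2
offers_P = [400, 350, 300, 290]   # x_P^t for t = 0..3, taken from Table 2

def nash_product(x):
    """Column z of Table 2, relative to the opening offers."""
    return (z_A(x) - z_A(offers_P[0])) * (z_P(x) - z_P(offers_A[0]))

def is_rational_partial_concession(t):
    xa, xp = offers_A[t], offers_P[t]
    xa_prev, xp_prev = offers_A[t - 1], offers_P[t - 1]
    rational = z_A(xa) < z_A(xp_prev) and z_P(xp) < z_P(xa_prev)      # (7)
    concession = z_P(xa) < z_P(xa_prev) and z_A(xp) < z_A(xp_prev)    # (8)
    inside = xa_prev < xa < xp_prev and xa_prev < xp < xp_prev        # (10), (11)
    return rational and concession and inside

all_ok = all(is_rational_partial_concession(t)
             for t in range(1, len(offers_A)))

for t, (xa, xp) in enumerate(zip(offers_A, offers_P)):
    print(t, xa, round(z_A(xa), 1), round(z_P(xa), 1), round(nash_product(xa)),
          xp, round(z_A(xp), 1), round(z_P(xp), 1), round(nash_product(xp)))
print(all_ok)
```

The cost columns agree with Table 2; the z column can differ from the printed
values in the last digits, presumably because the table multiplies rounded
costs.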










5 Concluding Remarks

In an SC, legally and economically independent companies act, so that
coordination follows the heterarchical principle. Assuming a symmetric
distribution of power, the bargaining process over the ordering and production
policy to be implemented was modeled and the bargaining solution determined. The
bargaining process presented leads to a win-win situation, since both actors
realize lower costs than in the agreement that is least favorable from their
point of view. The instruments and methods of game theory are suitable
approaches for analyzing the interaction of the SC partners and for supporting
decentralized planning approaches in the SC.



References

[1] Berninghaus SK, Ehrhart KM, Güth W (2002) Strategische Spiele - Eine
Einführung in die Spieltheorie. Berlin Heidelberg New York

[2] Cachon GP (1999) Competitive Supply Chain Inventory Management. In: Tayur S,
Ganeshan R, Magazine M (eds) Quantitative Models For Supply Chain Management.
Boston Dordrecht London, pp 111-145

[3] Corsten H, Gössinger R (2001) Einführung in das Supply Chain Management.
München Wien

[4] Frazier GL, Spekman RE, O’Neal CR (1988) Just-In-Time Exchange Relationships
in Industrial Markets. Journal of Marketing 52: pp 52-67

[5] Holler MJ, Illing G (2000) Einführung in die Spieltheorie. Berlin Heidelberg
New York

[6] Inderfurth K, Minner S (2001) Produktion und Logistik. In: Jost PJ (ed) Die
Spieltheorie in der Betriebswirtschaftslehre. Stuttgart, pp 307-349

[7] Isermann H (1974) Lineare Vektoroptimierung. Dissertation, Regensburg

[8] Scholz-Reiter B, Jakobza J (1999) Supply Chain Management - Überblick und
Konzeption. HMD Praxis der Wirtschaftsinformatik 207: pp 7-15

[9] Toporowski W (1999) Unternehmensübergreifende Optimierung der Bestellpolitik -
Das JELS-Modell mit einem Intermediär. ZfbF 10: pp 963-989

[10] Zäpfel G (2000) Supply Chain Management. In: Baumgarten H, Wiendahl HP,
Zentes J (eds) Logistik-Management. Berlin Heidelberg New York, pp 1-32




Modeling the Interaction between Operational 
and Financial Decisions in the Inventory Pooling 
of Repairable Spare Parts Problem 



Hartanto Wong, Dirk Cattrysse, Dirk Van Oudheusden 

Centre for Industrial Management, Katholieke Universiteit Leuven 
Celestijnenlaan 300 A, B-3001 Leuven, Belgium 
E-mail: Hartanto.Wong@cib.kuleuven.ac.be 



1 Introduction 

Equipment-intensive industries such as airlines, nuclear power plants, and
various process and manufacturing plants using complex machinery are often
confronted with the difficult task of maintaining high system performance while
controlling their inventory holding cost. An important class of components in
such industries is repairable items. The typical problem is to determine the
optimal stocking level of spare parts. An insufficient stock of spare parts can
lead to excessive downtime cost. On the other hand, maintaining an excessive
number of spare parts increases the cost of tying up capital in
non-revenue-generating spare parts inventories.

Inventory pooling, an ‘inter-company’ cooperation where the cooperating com- 
panies share their inventories, is an effective way to improve a company’s logisti- 
cal performance without requiring any additional cost. Cooperation usually takes 
the form of lateral transshipments from a location with a surplus of on-hand inven- 
tory to a location that faces a stock-out. 

Previous research in the area of inventory pooling for non-repairable items [1, 
6, 7, 9, 15, 16] and for repairable items [4, 8, 14, 18] has mainly focused on coop- 
eration among ‘bases’ which are assumed to be owned by a single ‘parent-firm’ 
whose objective is to maximize some measures of aggregate performance. Little 
research has been done in the context where the pooling members are independent 
companies and hence, there exists a competitive behavior in the sense that all 
members have interests in maximizing their own benefits from pooling. To model 
the competitive behavior in a decision making process, classical optimization con- 
cepts may no longer be applicable. Instead, game theoretic models should be used. 
Several studies [9, 10, 12, 17] have been done to model the inventory problem us- 
ing game theory. They consider a competitive version of the newsboy problem. 
Anupindi et al. [2, 3] develop a “co-opetitive” framework for the analysis of
decentralized distribution systems. They consider the sequential decisions of
inventory ordering and allocation of revenues/costs. For the cooperative
allocation decision, they use the concept of the core and develop sufficient
conditions for its existence. For the inventory decision, they develop
conditions for the existence of a pure-strategy Nash equilibrium.

This paper is in the spirit of the work of Anupindi et al. Two sets of decisions
are made when dealing with inventory pooling of repairable spare parts. The
first set consists of the operational decisions: the optimal number of spare
parts and the critical inventory level for allowing a lateral transshipment.
Pooling is complete when the critical inventory level is zero and partial when
it is positive. The second is the financial decision on how to allocate pooling
costs/benefits to every pooling member. In this paper, our objective is to
analyze the interaction between those two decisions.



2 Problem description 

Without loss of generality, we use the repairable spare parts inventory problem
of an airline company to describe our problem, and an example case of two
cooperating airline companies to describe our analysis. The parameters for both
companies are listed below.

• number of airplanes operated: M_1 = 15; M_2 = 10;

• failure rate of the repairable item: λ = 0.002/day;

• average repair time: 1/μ = 50 days;

• inventory holding cost: h_1 = h_2 = 10000 $/unit/year;

• AOG (Aircraft On Ground) unit cost: g_1 = g_2 = 1000 $/grounded-airplane/day;

• transshipment cost: t = 3000 $/transshipment.



3 Operational and financial decisions 



3.1 Operational decisions 

In this section, we consider the operational decisions without considering the allo- 
cation of cost or profit to each pooling member. In this setting, we treat all mem- 
bers as ‘bases’ and cooperation among them takes place within a single company. 

The problem can be formulated as follows. Each company wants to find the
number of spare parts, S_i, and the critical inventory level, k_i, that minimize
the total system cost TC (see Eq. 1). In this equation, AOG_i is the expected
number of AOG at company i, while Tr_i is the expected number of lateral
transshipments received by company i.

TC = \sum_i (h_i S_i + g_i AOG_i + t Tr_i)    (1)

In this paper, the machine-repair queuing model is chosen to solve the problem.
A more detailed description of the model is given in [18]. After investigating a
number of values of S_i and k_i, we obtain the optimal solutions S_1 = 3;
k_1 = 0; S_2 = 3; k_2 = 0, with the total cost TC = 72210 $. Without inventory
pooling, the optimal solutions for the two companies are S_1 = 4 and S_2 = 3,
and the resulting costs are C_1 = 48390 $ and C_2 = 38075 $. A significant cost
saving (14255 $) is achieved when the two companies pool their spare parts
inventories.
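
The no-pooling case can be illustrated with a small finite-source queuing
sketch. This is not the model of [18]; it is a simplified birth-death chain
built on assumptions made here: ample repair capacity, failures generated only
by operating airplanes, and a 365-day year. All function names are hypothetical.

```python
def expected_aog(m, s, failure_rate, repair_rate):
    """Expected number of grounded airplanes for one company, no pooling.

    State n = number of items in repair, 0 <= n <= m + s.
    Assumed dynamics: repair rate n * repair_rate (ample repair capacity),
    failure rate min(m, m + s - n) * failure_rate (operating airplanes only).
    """
    weights = [1.0]                      # unnormalised stationary weights
    for n in range(m + s):
        flying = min(m, m + s - n)
        weights.append(weights[-1] * flying * failure_rate
                       / ((n + 1) * repair_rate))
    total = sum(weights)
    return sum(max(0, n - s) * w for n, w in enumerate(weights)) / total

def yearly_cost(m, s, h=10_000, g=1_000, failure_rate=0.002,
                repair_rate=1 / 50):
    """Holding cost plus AOG cost per year (365 days assumed)."""
    return h * s + g * 365 * expected_aog(m, s, failure_rate, repair_rate)

def best_stock(m, max_stock=10):
    """Stock level with the lowest yearly cost among 0..max_stock."""
    return min(range(max_stock + 1), key=lambda s: yearly_cost(m, s))

for m in (15, 10):
    s = best_stock(m)
    print(m, s, round(yearly_cost(m, s)))
```

With the parameters of Section 2, this sketch selects 4 and 3 spare parts for
the two companies, and the yearly costs it returns are close to the stand-alone
costs C_1 and C_2 reported above. The pooled case needs a joint state space
with transshipments and is not covered here.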



3.2 Financial decision 

The main motivating factor behind the members’ desire to cooperate is the
potential cost savings for all pooling members. Thus, it is imperative that the
allocated cost is lower than the cost that each member can guarantee itself by
acting individually. Otherwise, the members will have no incentive to
participate in this cooperation. The concept of the “core” from cooperative game
theory is used to find a “fair” cost allocation [5]. Let us define X_i as the
cost allocated to company i. Using the optimal result from the previous section,
the core in our example case can be simply defined by: X_1 + X_2 = 72210;
X_1 < 48390; and X_2 < 38075. Determining which point to choose within the core
has been the subject of considerable study within the game theory and cost
accounting literature [11]. No attempt is made here to find the “best” cost
allocation method.
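
Membership in the core of this two-company example can be tested directly. The
sketch below is illustrative: the function name and the two candidate
allocations are hypothetical, and it uses weak inequalities, as in the usual
definition of the core, where the text above prints strict ones.

```python
def in_core(allocation, total_cost, stand_alone, tol=1e-9):
    """Two-player core test: the allocation covers the total cost exactly
    and no company pays more than its stand-alone cost."""
    efficient = abs(sum(allocation) - total_cost) <= tol
    rational = all(x <= c + tol for x, c in zip(allocation, stand_alone))
    return efficient and rational

TOTAL = 72_210                 # pooled total cost from Section 3.1
ALONE = (48_390, 38_075)       # stand-alone costs C_1 and C_2

candidate_a = (45_000, 27_210)   # hypothetical allocation
candidate_b = (50_000, 22_210)   # hypothetical: company 1 pays more than alone

print(in_core(candidate_a, TOTAL, ALONE))
print(in_core(candidate_b, TOTAL, ALONE))
```

With more than two players the core also requires a constraint for every
coalition, which this two-player test does not check.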

At first glance, the operational and financial decisions appear to be made
sequentially and can be treated separately: first the optimal operational
decisions are determined, and then the resulting cost is allocated. In fact,
this treatment is valid only when no competitive behavior is considered.
Competitive behavior among the pooling members calls for a different view of the
decision processes. For any agreement on the financial decision, the process of
making operational decisions involves strategic interaction, i.e., each pooling
member reacts optimally to the other’s decisions. In the next section, we
discuss the interaction between the two decisions. For this purpose, four cost
allocation methods are considered.



4 The interaction 

We will now look at the Nash equilibria of the games in our example. In the
first part, all costs and parameters are treated as common knowledge in the
system. Consequently, this competitive situation can be considered a two-player
game with complete information. In the second part, we extend our analysis by
considering the game with incomplete information. All costs and parameters,
except for the AOG unit cost g_i, remain common knowledge.



4.1 Games with complete information 

In this part we consider four games corresponding to the four selected allocation 
methods. The description of each allocation method is given below. 

• Cost allocation method 1: inventory holding cost is allocated based on the 
number of spare parts provided by each company; AOG cost is allocated based 
on individual AOG; transshipment cost is paid by the “receiver” company. 







• Cost allocation method 2: inventory holding cost and transshipment cost are al- 
located based on the number of airplanes operated by each company; AOG cost 
is allocated based on individual AOG. 

• Cost allocation method 3: the total cost is allocated based on the demand rate of 
each company (the number of airplanes operated by each company). 

• Cost allocation method 4: Shapley value (the formula to calculate Shapley 
value can be seen in [11]). 
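
For illustration, methods 3 and 4 can be applied to the system-optimal total
cost of Section 3.1. This is a minimal sketch with hypothetical function names;
the two-player Shapley value is written in its equal-split-of-savings form
rather than with the general formula referred to in [11].

```python
def proportional_allocation(total_cost, weights):
    """Method 3: split the total cost in proportion to the given weights."""
    total_weight = sum(weights)
    return [total_cost * w / total_weight for w in weights]

def shapley_two_players(total_cost, stand_alone):
    """Method 4 for two players: each company pays its stand-alone cost
    minus half of the joint saving."""
    c1, c2 = stand_alone
    saving = c1 + c2 - total_cost
    return [c1 - saving / 2, c2 - saving / 2]

TOTAL = 72_210                 # pooled total cost from Section 3.1
ALONE = (48_390, 38_075)       # stand-alone costs C_1 and C_2
AIRPLANES = (15, 10)           # M_1 and M_2

method_3 = proportional_allocation(TOTAL, AIRPLANES)
method_4 = shapley_two_players(TOTAL, ALONE)
print(method_3, method_4)
```

In the games below the allocation is applied to the cost of whatever decisions
the companies actually choose, so these figures describe only the cooperative
benchmark.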

In each game, the set of operational decisions (S_1, k_1)* and (S_2, k_2)* is a
Nash equilibrium if (S_1, k_1)* is the best response for company 1 to
(S_2, k_2)* and (S_2, k_2)* is the best response for company 2 to (S_1, k_1)*.
In the first and second games, however, the Nash equilibria are not optimal,
unlike in the other two games. The optimal solutions for the system are
S_1 = 3; k_1 = 0; S_2 = 3; k_2 = 0, which are the optimal decisions when the
cooperation is treated as if occurring inside a single company. Cost allocation
methods 1 and 2 do not give both companies enough incentive to cooperate. The
Nash equilibria in these two games are S_1 = 4; k_1 = 4; S_2 = 3; k_2 = 3. This
means that no pooling takes place in these equilibria. Because some cost
components are paid individually under the first two methods, both companies
have an incentive to lower their individual costs by unilaterally deviating from
any cooperative solution. These examples clearly show that the selection of
operational decisions is influenced by the agreement on cost allocation
(financial decision).
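
The best-response condition can be checked by enumeration once each company's
cost is known for every pair of decisions. The sketch below is generic; the
2x2 cost matrices are made up for illustration and are not results from the
paper. Each strategy index stands for one pair (S_i, k_i).

```python
from itertools import product

def pure_nash_equilibria(cost_1, cost_2):
    """Return all pure-strategy Nash equilibria of a two-player cost game.

    cost_k[i][j] is the cost of company k when company 1 plays strategy i
    and company 2 plays strategy j; both companies minimise their own cost.
    """
    rows, cols = range(len(cost_1)), range(len(cost_1[0]))
    equilibria = []
    for i, j in product(rows, cols):
        best_for_1 = all(cost_1[i][j] <= cost_1[r][j] for r in rows)
        best_for_2 = all(cost_2[i][j] <= cost_2[i][c] for c in cols)
        if best_for_1 and best_for_2:
            equilibria.append((i, j))
    return equilibria

# Made-up costs, NOT taken from the paper.
# Strategy 0 = "pool fully", strategy 1 = "do not pool".
cost_1 = [[40, 55], [38, 48]]
cost_2 = [[31, 29], [45, 38]]

print(pure_nash_equilibria(cost_1, cost_2))
```

In this made-up game the only equilibrium is the pair in which neither company
pools, although joint pooling has the lower total cost. That is the same
qualitative effect as described for allocation methods 1 and 2.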



4.2 Games with incomplete information 

Since many individual parameters are more likely to be private than common
knowledge, cost allocation methods based on the total joint cost (like methods 3
and 4 in our example) are not easy to implement. To make this type of cost
allocation method applicable, all pooling members must be willing to disclose
all their private information. This is not easy, especially when the pooling
members are competitors. Moreover, the possibility of profiting from untrue
information complicates the problem. In this part, we analyze how the game is
played when not all cost parameters are public information. We choose the AOG
unit cost as private information because, among the three cost components, it is
the most difficult to quantify and can be very subjective. We assume that the
other two cost components, i.e., inventory holding and transshipment costs, are
still traceable. We use an example case where the AOG unit costs of both
companies are private information and their true values are g_1 = g_2 = 2000.
Let us define g_i' as the value of the AOG unit cost submitted by company i. To
simplify the problem, we assume that g_i' can only take values that are
multiples of 500, from a minimum of 1000 up to a maximum of 3000.

In this example, cost allocation method 4 (Shapley value) is applied. The game
is played as follows. First, companies 1 and 2 submit the values g_1' and g_2'.
Based on these values, they make operational decisions to minimize their
individual costs. We assume that the following settlement is used to conclude
the final cost allocation. First, company i claims its payment as:

TCC_i = h_i S_i + g_i' AOG_i + t Tr_i .    (2)

The Shapley value method gives the cost allocation X_i' for company i. Thus, to
reach the final value X_i', company i has to pay its partner if X_i' is higher
than TCC_i. Each company will, of course, finally consider its actual cost, X_i,
as the basis of its decisions. Eq. 3 is used to calculate X_i.

X_i = h_i S_i + g_i AOG_i + t Tr_i + X_i' - TCC_i    (3)

It can be shown that the Nash equilibrium of the considered game is g_1' = 1000;
g_2' = 1000, with the operational decisions S_1 = 3; k_1 = 0; S_2 = 3; k_2 = 0.
The resulting costs are 45454 $ and 33716 $ for companies 1 and 2, respectively.
This gives a total cost of 79170 $, which is 2286 $ higher than the optimal
total cost obtained if both companies reveal the true values of their AOG unit
costs (the optimal operational decisions for g_1 = g_2 = 2000: S_1 = 4; k_1 = 0;
S_2 = 3; k_2 = 0; total cost = 76884 $). This example case shows that under
incomplete information each pooling member has an incentive to report untrue
information as its optimal reaction to the other member’s report.



5 Conclusions 

In this paper, we use game-theoretic models to analyze the decision processes in
the problem of inventory pooling of repairable spare parts. The analysis
considers both cooperative and competitive behavior of the pooling members in
making their operational and financial decisions. We show the interaction
between the two decisions. The financial decision on cost allocation determines
whether or not the pooling members have an incentive to cooperate by sharing
their spare parts inventories. We also extend our analysis to the situation with
incomplete information, in which the AOG unit cost is assumed to be private
rather than common knowledge. We show that each pooling member has an incentive
to report untrue information as its optimal reaction to the other member’s
report, and that this can lead the pooling members to play the game with a
non-optimal solution.

Two related issues are of interest for further research. The first is to extend
the problem to games with more than two players. The second is to find a general
framework for the design of cost allocation policies such that all pooling
members are motivated to cooperate fully and reveal true information, so that
the equilibrium yields the optimal solution.



References 

1. Alfredsson P, Verrijdt J (1999) Modeling emergency supply flexibility in a two-echelon
inventory system. Management Science 45: 1416-1431

2. Anupindi R, Bassok Y, Zemel E (1999) Study of decentralized distribution systems:
part I - general framework. Working paper, Kellogg Graduate School of Management,
Northwestern University, Evanston, IL

3. Anupindi R, Bassok Y, Zemel E (1999) Study of decentralized distribution systems:
part II - applications. Working paper, Kellogg Graduate School of Management,
Northwestern University, Evanston, IL

4. Axsäter S (1990) Modelling emergency lateral transshipments in inventory systems.
Management Science 36: 1329-1338

5. Gerchak Y, Gupta D (1991) On apportioning costs to customers in centralized
continuous review inventory systems. Journal of Operations Management 10: 546-551

6. Grahovac J, Chakravarty A (2001) Sharing and lateral transshipment of inventory in a
supply chain with expensive low-demand items. Management Science 47: 579-594

7. Kukreja A, Schmidt CP, Miller DM (2001) Stocking decisions for low-usage items in a
multilocation inventory system. Management Science 47: 1371-1383

8. Lee HL (1987) A multi-echelon inventory model for repairable items with emergency
lateral transshipments. Management Science 33: 1302-1316

9. Lippman SA, McCardle KF (1997) The competitive newsboy. Operations Research 45:
54-65

10. Parlar M (1988) Game theoretic analysis of the substitutable product inventory problem
with random demands. Naval Research Logistics 35: 397-409

11. Robinson LW (1993) A comment on Gerchak and Gupta’s “On apportioning costs to
customers in centralized continuous review inventory systems”. Journal of Operations
Management 11: 99-102

12. Rudi N, Kapur S, Pyke DF (2001) A two-location inventory model with transshipment
and local decision making. Management Science 47: 1668-1680

13. Sherbrooke CC (1968) METRIC: A multi-echelon technique for recoverable item
control. Operations Research 16: 122-141

14. Sherbrooke CC (1992) Multi-echelon inventory systems with lateral supply. Naval
Research Logistics 39: 29-40

15. Tagaras G (1999) Pooling in multi-location periodic inventory distribution systems.
Omega International Journal of Management Science 27: 39-59

16. Tagaras G, Cohen MA (1992) Pooling in two-location inventory systems with
non-negligible replenishment lead times. Management Science 38: 1067-1083

17. Wang Q, Parlar M (1994) A three-person game theory model arising in stochastic
inventory control theory. European Journal of Operational Research 76: 83-97

18. Yanagi S, Sasaki M (1992) An approximation method for the problem of a
repairable-item inventory system with lateral supply. IMA Journal of Mathematics
Applied in Business and Industry 3: 305-314




Estimating Brand Loyalty, Non-Buyer Share, 
and Market Potential from Retail Panel Data 



Heribert Reisinger, Udo Wagner, Matthias Schuster 

Institut für Betriebswirtschaftslehre, Universität Wien, Brünner Straße 72, 1210 
Wien, e-mail: {heribert.reisinger | udo.wagner | matthias.schuster}@univie.ac.at 



1 Problem Statement 

A core task of marketing is to identify the relevant customer needs and to 
tailor the offering specifically to them. It is often not possible to work the 
entire market in the same way, so suitable segmentation techniques are used. An 
important kind of segmentation in consumer goods markets relies on behavioral 
variables, among which purchase frequency and brand loyalty play an essential 
role. The market researcher usually supplies the required information on the 
basis of household panel data or of primary surveys conducted specifically for 
this purpose. This paper deals with determining the shares of loyal customers, 
brand switchers, and non-buyers, but takes a different route: it proposes to 
derive individual demand behavior from retail panel data, that is, from data 
that do not refer to individual persons. The advantage of such a data source is 
that it is collected regularly (mostly by means of modern scanner technology), 
is used in particular for logistics purposes, and is therefore often available 
at no additional cost. 

The basic ideas of the methodology put up for discussion are borrowed from 
several fields of research. First, marketing contributes established findings on 
regularities between market shares, brand loyalty, and brand switching for 
frequently purchased everyday goods. These patterns have been incorporated into 
various models (in particular Ehrenberg 1988; Colombo, Morrison 1989), which in 
turn serve as the starting point here. Furthermore, ecological inference (King 
1997) supplies the principle that measures which refer to cross-sections of 
persons and were obtained by aggregating individual data still tend to reflect 
the characteristic traits of the members of these groups. Such characteristics 
can therefore be captured by a statistical analysis. The best-known example is 
probably the analysis of voter transitions based on the results of two 
consecutive elections in a large number of electoral districts or precincts. 
Finally, econometrics contributes the 'maximum entropy estimation method' 
(Jaynes 1957), which is used to advantage above all when knowledge about a 
matter is incomplete and the investigator wishes to obtain results that are as 
unaffected by subjective influences as possible. Such an estimation method 
appears particularly suitable for the task at hand because, on the one hand, the 
task is a kind of disaggregation and, on the other hand, no information 
whatsoever about non-buyers can be derived directly from the data. 



2 Solution Approach 

2.1 Modeling Brand Loyalty and Brand Switching 

In line with the above, in this section we first postulate certain relationships 
between the shares of loyal customers and of brand switchers, concentrating on 
the two consecutive periods $t$ and $(t+1)$. 



[Equations (1) and (2) are missing in the source.] 

where: 

$p_{j|i}$    conditional probability of switching to brand $j$ after brand $i$ 
             was purchased previously 
$p_{j|j}$    conditional probability of buying brand $j$ again 
$m_{jt}$     market share of brand $j$ at time $t$ 
$\rho_j$     parameter depending on brand $j$; $0 < \rho < 1$ 
$i, j$       brand indices ($i, j = 1, \ldots, I$) 
$I$          number of brands considered 
$t$          time index 

The substantive justification for this approach is the finding that, in many 
consumer goods markets, the shares of buyers switching between two brands vary 
in proportion to the product of the brands' market shares. Restricting attention 
to the customers of one brand $i$ yields (1); taking into account, in addition, 
the summation condition (logical consistency) for the $p_{j|i}$ yields (2). The 
parameter $\rho_j$ has entered the literature under the name 'switching 
constant' and is discussed there in detail (for a presentation in a very similar 
context see Wagner et al. 2001). Put plainly, $\rho_j$ describes the extent of 
fluctuation in the market under consideration: the smaller $\rho_j$, the higher 
the share of loyal consumers of brand $j$; a value close to one, by contrast, 
indicates pronounced switching behavior. In a formally identical model for 
capital goods and services, Colombo, Morrison (1989) distinguish between 
customers who are loyal in any case ('hard-core loyal') and potential switchers; 
in their scenario, $(1 - \rho)$ then describes the share of the former. 
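
To make the role of the switching constant concrete, the following minimal 
sketch builds a brand-switching matrix. Because equations (1) and (2) are not 
legible in the source, the sketch assumes a Colombo-Morrison-type form, 
$p_{j|i} = \rho_i m_j$ for $i \neq j$ and $p_{i|i} = 1 - \rho_i (1 - m_i)$, which 
satisfies the proportionality and summation conditions described above. The 
function name `transition_matrix` and all numbers are illustrative only. 

```python
import numpy as np

def transition_matrix(shares, rho):
    """Build P with P[i, j] = probability of buying brand j after brand i.

    Assumed form (an illustration, not taken verbatim from the paper):
      P[i, j] = rho[i] * shares[j]              for i != j
      P[i, i] = 1 - rho[i] * (1 - shares[i])
    """
    m = np.asarray(shares, dtype=float)
    r = np.asarray(rho, dtype=float)
    P = r[:, None] * m[None, :]                # off-diagonal switching terms
    np.fill_diagonal(P, 1.0 - r * (1.0 - m))   # repeat-purchase terms
    return P

shares = [0.5, 0.3, 0.2]                       # hypothetical market shares
P = transition_matrix(shares, [0.2, 0.2, 0.2])
print(P.round(3))
print(P.sum(axis=1))   # each row sums to one because the shares sum to one
```

Under this assumed form, a small $\rho$ concentrates probability on the main 
diagonal (loyal behavior), and with a common $\rho$ for all brands the market 
shares are stationary under the matrix. 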








2.2 Parameter Estimation Based on Cross-Sectional Data 

As set out in the problem statement, cross-sectional information is available 
for estimating the parameters, in this case the sales $q_{jkt}$ and 
$q_{jk(t+1)}$ of all brands $j$ in the cross-sections $k$ (e.g., stores, 
geographic districts) in two consecutive periods $t$ and $(t+1)$. Following the 
principle of ecological inference, we postulate parameters that are constant 
across the cross-sections and formulate our estimation problem as a 
multi-equation regression. 

[Equation (3) is missing in the source.] 

To determine the parameters, we replace the $p_{j|i}$ by (1) and (2), 
respectively. We choose a generalized least-squares criterion, using the 
reciprocal cross-section sizes as weights, and obtain an optimization problem 
with constraints that ensure the logical consistency of the estimates (for the 
formal details see Wagner et al. 2001). 



2.3 The Problem of Non-Buyers 

The regularities (1) and (2), which are central to our approach, were identified 
by examining individual purchase histories. That is, brand choice behavior was 
analyzed over consecutive purchase acts, where the time spans in between can be 
of different lengths. If retail panel data are available instead, the 
observation rhythm is fixed in advance (say, weekly or monthly), and individual 
consumption cycles are therefore disregarded. With a sufficiently large sample, 
it can be argued that although the sales recorded in the two periods do not stem 
from the same households, those households that demand only in the first or only 
in the second observation period balance each other out in total, with respect 
to both quantity and brand choice behavior ('left censoring' and 'right 
censoring' problems cancel out). However, this essentially presupposes 
stationary conditions and thus a constant level of marketing effort by the 
suppliers. Short-term promotions at the point of sale in particular may well 
influence the choice of purchase timing (consider, for example, purchases 
brought forward because of price reductions). The voter transition model 
mentioned above, which is formally very similar to (3), need not deal with the 
problem of interpurchase times, since election dates are fixed, but in some 
versions it does account for 'non-voters'. 

We therefore extend our approach and introduce an additional state 'no 
purchase', denoted by $(I+1)$. Both the range of the running indices $i$ and $j$ 
and the range of summation in (3) are enlarged accordingly. In terms of purchase 
behavior, this means that the model makes no distinction between the choice of 
brand and the choice of purchase timing. It would therefore be incompatible with 
the approach if buyers who prefer a certain brand purchased more or less 
frequently than others. This is of course a restrictive assumption, but it is 
commonly made in the literature and is postulated here as well, above all for 
pragmatic reasons. 








While accounting for non-buyers in the model is comparatively simple, it poses a 
fundamental problem for the data input. On the one hand, the $q_{(I+1)kt}$ and 
$q_{(I+1)k(t+1)}$ are not sales but the numbers of non-buyers in the individual 
cross-sections; on the other hand, retail panel data contain no information 
about them. The first aspect is put into perspective by the fact that the 
derivation of the regularities (1) and (2) also ignored purchase quantity 
behavior, which nevertheless shows up in the sales figures from the retail 
panel. However, research findings (e.g., Ehrenberg 1988) show that such 
differences at the individual level are largely evened out by aggregation. For 
this reason, such a simplification seems justified, again on pragmatic grounds, 
and then the difference between the number of purchase acts (in place of sales) 
and the number of non-buyers is no longer serious. The second aspect, however, 
requires estimating the $q_{(I+1)kt}$ and $q_{(I+1)k(t+1)}$. 

We propose two different procedures for this. In both, we initially assume that 
the market potential $Q$, i.e., the total number of all buyers in the market 
under study, is known and constant over the two observation periods. The latter 
should be unproblematic given the time spans considered. A simple option is then 
to apportion $Q$ to the individual cross-sections in proportion to their size 
(analogously for $(t+1)$). 

[Equation (4) is missing in the source.] 

where: 

$\sum_j q_{jkt}$    size of cross-section $k$ in period $t$ (in terms of sales 
                    in the market under study) 
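
The proportional allocation can be sketched as follows. Since (4) itself is not 
legible in the source, this is a minimal sketch under the assumption that each 
cross-section receives the part of $Q$ corresponding to its share of total sales 
and that its non-buyers are what remains after subtracting its own sales. The 
name `allocate_non_buyers` and the sample figures are hypothetical. 

```python
import numpy as np

def allocate_non_buyers(sales, Q):
    """Estimate the number of non-buyers per cross-section for one period.

    sales: array of shape (K, I) with sales per cross-section k and brand j.
    Q:     assumed market potential, in the same units as the sales.
    """
    sales = np.asarray(sales, dtype=float)
    size = sales.sum(axis=1)          # size of each cross-section
    total = size.sum()
    if Q < total:
        raise ValueError("Q must be at least as large as total sales")
    potential = Q * size / total      # proportional split of Q (assumption)
    return potential - size           # remainder is attributed to non-buyers

sales_t = [[30, 10], [45, 15]]        # two cross-sections, two brands
print(allocate_non_buyers(sales_t, Q=150))
```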



Another variant results from an optimization problem very similar to that of 
Section 2.2. 

[Equations (5) and (6) are missing in the source.] 



The objective function is again the generalized least-squares criterion, but the 
unknowns are now the $q_{(I+1)kt}$ and $q_{(I+1)k(t+1)}$; the constraints (6) 
ensure logical consistency. (5) presupposes that a first estimate of the 
$p_{j|i}$ is available. This can be obtained, for instance, by applying the 
Hendry model (Vanhonacker 1980) at the aggregate level, with $\rho$ constant 
across all brands and the no-purchase state. Using the $q_{(I+1)kt}$ and 
$q_{(I+1)k(t+1)}$ determined in this way, applying (3) yields revised $p_{j|i}$ 
and $\rho$. If the two estimates of $\rho$ differ strongly, the two 
computational steps above should be repeated iteratively until convergence. 



2.4 Determining the Market Potential 

The preceding section assumed that the market potential $Q$ is known. This holds 
when the approach is applied to voter transition analysis, because the number of 
eligible voters is officially determined. In marketing the situation is more 
difficult, since $Q$ will be known only in exceptional cases and a retail panel 
provides no directly usable information about it. We therefore propose a maximum 
entropy approach. 

[Equation (7) is missing in the source.] 

Here (7) is regarded as a one-dimensional function of $Q$. One essential 
argument for choosing this method was already given in Section 1. Another is 
that it can be expected to strike a compromise between the general tendency of 
maximum entropy estimates to produce approximately uniformly distributed 
probabilities and the model formulation (1) and (2), which as a rule generates 
transition matrices with a dominant main diagonal. 

In summary, the procedure for solving the problem presented is as follows: 

1) For a given $Q$, first determine the total number of non-buyers in $t$ and 
$(t+1)$. 

2) At the aggregate level, compute an approximation of $\rho$ using the Hendry 
model. 

3) (4), or (5) and (6), provide estimates of the number of non-buyers per 
cross-section. 

4) The quadratic optimization problem corresponding to (3) yields improved 
estimates of $\rho$. 

5) If necessary, steps 3) and 4) are iterated. 

6) The function value of $H$, cf. (7), is computed, $Q$ is changed, and steps 1) 
to 5) are repeated. 

7) Finally, those parameters are chosen for which $H$ becomes maximal. 
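
The outer loop of steps 1) to 7) can be sketched as a simple grid search over 
candidate values of $Q$. This is only a structural sketch: the inner estimation 
(steps 1 to 5) is passed in as a hypothetical callback `estimate_transitions`, 
and, because (7) is not legible in the source, $H$ is assumed here to be the 
Shannon entropy of the estimated transition probabilities. 

```python
import numpy as np

def shannon_entropy(P):
    """Assumed entropy measure: H = -sum p * log(p), with 0 * log(0) taken as 0."""
    p = np.asarray(P, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def select_market_potential(candidates, estimate_transitions):
    """Return (Q, H, P) for the candidate Q whose estimate has maximal entropy."""
    best = None
    for Q in candidates:
        P = estimate_transitions(Q)     # steps 1) to 5), hypothetical callback
        H = shannon_entropy(P)          # step 6)
        if best is None or H > best[1]:
            best = (Q, H, P)            # step 7): keep the maximizer
    return best

# Toy stand-in for the inner estimation, for illustration only.
def toy_estimate(Q):
    loyal = {100: 0.9, 200: 0.5, 300: 0.7}[Q]
    return np.array([[loyal, 1 - loyal], [1 - loyal, loyal]])

Q_best, H_best, _ = select_market_potential([100, 200, 300], toy_estimate)
print(Q_best, H_best)
```

A grid search is used here only because it is the simplest way to treat $H$ as a 
one-dimensional function of $Q$; any one-dimensional search could take its place. 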



3 Summary and Outlook 

In this paper we have presented a procedure with which the shares of loyal 
customers, of switchers and of non-buyers, as well as the market potential, can 
be estimated from retail panel data. To our knowledge, such an algorithm has not 
yet been presented in the literature. In a further step, its performance must of 
course be examined. Here, however, one encounters an additional difficulty 
inherent in the problem formulation. A rigorous validation against empirical 
data is not possible, because a retail panel by definition provides no 
person-related information and a household panel covering the same market is not 
fully compatible with it (cf., e.g., Nenning et al. 1979). The authors have 
nevertheless carried out some empirical studies, which turned out quite 
promising. 

Simulation offers a remedy here. First, individual purchase acts are generated, 
the measures of interest (brand loyalty, etc.) are computed, and the data are 
aggregated to the cross-section level. The data obtained in this way serve as 
input for the proposed algorithm, and the estimated parameters can be compared 
with the 'true' values. The design of the simulation experiment should, 
moreover, take into account the potentially most important factors influencing 
the performance of the model (such as the number of non-buyers, of brands, and 
of cross-sections, the sample size, etc.). Here, too, first computations yield 
satisfactory results. 



References 

Colombo RA, Morrison DG (1989) A Brand Switching Model with Implications for Mar- 
keting Strategies. Marketing Science 8(1), 89-99 

Ehrenberg ASC (1988) Repeat Buying: Facts, Theory and Applications. Charles Griffin & 
Company LTD, London 

Jaynes ET (1957) Information Theory and Statistical Mechanics. Physical Review 106, 
620-630 

King G (1997) A Solution to the Ecological Inference Problem: Reconstructing 
Individual Behavior from Aggregate Data. Princeton University Press, Princeton 

Nenning M, Topritzhofer E, Wagner U (1979) Zur Kompatibilität alternativer 
kommerziell verfügbarer Datenquellen für die Marktreaktionsmodellierung: Die 
Verwendung von Prewhitening-Filtern und Kreuzspektralanalyse sowie ihre 
Konsequenzen für die Analyse betriebswirtschaftlicher Daten. Zeitschrift für 
Betriebswirtschaft 49 (9), 281-297 

Vanhonacker WR (1980) The Hendry partitioning methodology: Theory and applications. 
Research Working Paper No. 288A, Graduate School of Business, Columbia 
University 

Wagner U, Reisinger H, Gausterer K (2001) Die Bestimmung des 
Markenwechselverhaltens mit Hilfe von Querschnittsdaten. Zeitschrift für 
Betriebswirtschaft 71 (10), 1113-1130 





A Conjugate Direction Frank-Wolfe Method 
with Applications to the Traffic Assignment 
Problem 



Maria Daneva and Per Olov Lindberg 

Linköping University, Department of Mathematics, SE-581 83 Linköping, Sweden, 
e-mails: madan, polin@mai.liu.se 



Abstract. We present a version of the Frank-Wolfe method for linearly constrained 
convex programs, in which consecutive search directions are made conjugate to each 
other. We also present preliminary computational studies in a MATLAB environment. 
In these we apply the pure Frank-Wolfe, the Conjugate Direction Frank-Wolfe 
(CDFW) and the "partanized" Frank-Wolfe to some classical Traffic Assignment 
Problems. CDFW compares favorably to the other methods in this study. 



1 Introduction 

The Frank-Wolfe (FW) method [4] was originally suggested for quadratic 
programming problems, but the original paper also noted that it could be 
applied to linearly constrained convex programs. The main usage of the FW 
method has been in routing problems in the telecom and traffic areas, where 
it is usually attributed to [5] and [8], respectively; see however also [2]. It 
is still the most popular method in these areas. According to the survey [13], 
"The reason for this is twofold. The method easily generates feasible solutions 
from shortest path calculations and enjoys fast convergence in the early 
iterations". However, to cite [14], "The unsatisfactory performance of the 
Frank-Wolfe algorithm, in particular at the vicinity of an optimal solution, was 
observed quite early." Since this behavior was observed early, many attempts 
have been made to improve upon the FW method, in particular by finding better 
search directions (e.g. [6], [9], [11]). The most prominent of these attempts is 
the PARTAN approach [9], where one gets better directions by taking PARTAN 
steps (e.g. [10], p. 255-257, [16]) between pure Frank-Wolfe steps. An intrinsic 
difficulty with the PARTAN technique in the Frank-Wolfe context is the 
determination of appropriate bounds on the step lengths to stay feasible. 
In the early papers these bounds were determined individually for the first 
few steps. Later, [1] and [3] determined analytical recursion formulas for 
these bounds. In spite of improved convergence characteristics, the PARTAN 
approach has not superseded the Frank-Wolfe method in the telecom and 
transportation areas. 

In the current paper we suggest a simple improvement of the FW directions: 
making them conjugate. Every textbook on nonlinear programming advocates 
conjugate gradients over pure gradient directions (i.e. steepest descent). 
PARTAN owes some of its motivation to conjugate directions. In fact, applied to 
an unconstrained convex quadratic problem, PARTAN will generate conjugate 
directions ([10], p. 255-257). 

In the next section we review the FW method and its "PARTANized" 
version. Section 3 is devoted to conjugate direction techniques and our own 
conjugate direction FW method (CDFW). In Section 4 we apply MATLAB 
versions of pure FW, partanized FW (PFW) and CDFW to a set of classical 
traffic assignment problems. Measuring performance per major iteration, it 
is demonstrated that FW is clearly outperformed by PFW, which in turn is 
clearly outperformed by CDFW. 

2 The Frank-Wolfe Method and Modifications 

Let $f : \mathbb{R}^n \to \mathbb{R}$ be a twice continuously differentiable 
convex function and let $X \subset \mathbb{R}^n$ be a compact convex polyhedral 
set. The methods discussed in this section apply to the problem of minimizing 
$f$ over $X$: 

(P)   $f^* = \min_{x \in X} f(x)$. 

By continuity of $f$, (P) has a solution, unique if $f$ is strictly convex. 

The problem (P) is in principle well-behaved, but in the traffic and tele- 
com applications we aim at, it is a convex cost multicommodity flow prob- 
lem, which typically becomes very large. Thus, one needs to utilize structure. 
The Frank-Wolfe algorithm allows efficient utilization of the multicommodity 
structure of (P). We briefly discuss it in the next section. 



2.1 The Frank-Wolfe Method 

The conventional FW algorithm [4] was introduced for solving general linearly 
constrained quadratic and convex problems. At iteration $k$, FW approximates 
$f$ by linearizing at the current point $x_k$, giving an affine minorant $f_k$ 
to $f$: 

$f_k(x) = f(x_k) + \nabla f(x_k)^T (x - x_k)$. 

The first step in the FW method is the determination of the search direction, 
which is done by minimizing $f_k$ over $X$. This is a linear program 

(LP$_k$)   $f_k^* = \min_{x \in X} f_k(x)$, 

the solution of which we denote by $y_k^{FW}$. In the telecom and traffic 
applications, however, (LP$_k$) decomposes into a set of shortest path problems 
(e.g. [14], p. 97). Note that $f_k^*$ is a lower bound, $LBD_k$, to $f^*$, a 
fact which may be used in termination criteria. 







The second step of the FW method is to perform a line search, i.e. a 
one-dimensional minimization of $f(x)$, along the line segment between the 
current solution $x_k$ and the solution $y_k^{FW}$. The point where this minimum 
is achieved (at least approximately) is chosen as the next iterate $x_{k+1}$. 

As already mentioned this algorithm is very easy to implement, but due to 
the slow asymptotic convergence, many modifications have been suggested. 
The line search can e.g. be modified in order to take predetermined [15] or 
longer steps [17] or the search direction can be combined with previous ones 
[6], [9], [11]. In other modifications the linear subproblem is modified in order 
to avoid generating extreme point solutions [7]. 
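
The two steps of the method can be illustrated with a minimal sketch. It is not 
the traffic assignment implementation used in this paper: as an assumption for 
illustration, the feasible set is the unit simplex (so the linear program is 
solved by picking the vertex with the smallest gradient component) and the 
objective is a convex quadratic, for which the line search is exact. The names 
`frank_wolfe_simplex`, `A` and `b` are hypothetical. 

```python
import numpy as np

def frank_wolfe_simplex(A, b, x0, max_iter=1000, tol=1e-10):
    """Minimize f(x) = 0.5 x'Ax - b'x over the unit simplex by pure Frank-Wolfe.

    A is assumed symmetric positive semidefinite.
    Returns (x, f(x), best lower bound found).
    """
    f = lambda x: 0.5 * x @ A @ x - b @ x
    x = np.asarray(x0, dtype=float).copy()
    lbd = -np.inf
    for _ in range(max_iter):
        g = A @ x - b                    # gradient at the current point
        y = np.zeros_like(x)
        y[np.argmin(g)] = 1.0            # LP solution: a vertex of the simplex
        d = y - x                        # FW search direction
        lbd = max(lbd, f(x) + g @ d)     # minimum of the linearization bounds f*
        if g @ d > -tol:                 # no descent left: (near-)optimal
            break
        curv = d @ A @ d
        # exact line search on the segment [x, y] for a quadratic objective
        step = 1.0 if curv <= 0 else min(1.0, -(g @ d) / curv)
        x = x + step * d
    return x, f(x), lbd

x, fx, lbd = frank_wolfe_simplex(np.eye(3), np.zeros(3), np.array([1.0, 0.0, 0.0]))
print(x, fx, lbd)
```

The gap between `fx` and `lbd` is the kind of quantity a termination criterion 
based on the lower bound would monitor. 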



2.2 The PARTAN Technique 

The method of parallel tangents (PARTAN) originates in attempts to avoid 
zig-zagging in gradient based methods for unconstrained optimization ([10], 
p. 254, [16]), by performing an extra line search at each iteration. In the 
PARTAN technique for unconstrained optimization, at the iterate $x_k$, one 
first performs a line search in the gradient direction, giving the intermediate 
point $y_k$. Then one applies an extra line search from $y_k$ in the direction 
$y_k - x_{k-1}$, giving the new iterate $x_{k+1}$. 

In the partanized FW method, the line search in the gradient direction is 
replaced by one in the direction towards the solution, $y_k^{FW}$, of the 
linearized problem. 

3 The Conjugate Direction Frank-Wolfe Method 

In this section we describe our modification of the FW method. In conjugate 
direction methods (e.g. [10], Ch. 8) for unconstrained convex quadratic 
optimization, one performs line searches consecutively in a set of directions, 
$d_1, \ldots, d_n$, mutually conjugate with respect to the Hessian $H$ of the 
objective (i.e. fulfilling $d_i^T H d_j = 0$ for $i \neq j$). In $\mathbb{R}^n$ 
the optimum then is identified after $n$ line searches ([10], p. 241, Expanding 
Subspace Theorem). 

In conjugate gradient methods, one obtains conjugate directions by 
"conjugatizing" the gradient direction with respect to the previous search 
direction; i.e. $d_k = -\nabla f(x_k) + \beta_k d_{k-1}$, with $\beta_k$ chosen 
so that $d_k$ is conjugate to $d_{k-1}$. In the quadratic case, $d_k$ then in 
fact becomes conjugate to all previous directions $d_1, \ldots, d_{k-1}$ (e.g. 
[10], p. 245, Conjugate Gradient Theorem). 

The same trick can be applied to the FW method: Let $x_k$ be the current 
point (see Figure 1), found by a line search in the direction $d_{k-1}$, from 
$x_{k-1}$ towards $w_{k-1} \in X$. At $x_k$ we solve the linearized problem 
(LP$_k$) giving $y_k^{FW} \in X$. In the FW method the new search direction is 
taken as $d_k^{FW} = y_k^{FW} - x_k$. In the conjugate direction FW method 
(CDFW), we choose the search direction as 

$d_k = d_k^{FW} + \beta_k d_{k-1}$,   (1) 

where $\beta_k$ is chosen to make $d_k$ conjugate to $d_{k-1}$ with respect to 
the Hessian $H_k$ of $f$ at $x_k$, which can be done cheaply, since $H_k$ is 
diagonal. Equivalently (up to scaling) we choose $d_k = w_k - x_k$, where $w_k$ 
is a convex combination of $y_k^{FW}$ and $w_{k-1}$, both belonging to $X$. 
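
The choice of $\beta_k$ can be written out explicitly. Requiring 
$d_k^T H_k d_{k-1} = 0$ in (1) gives 
$\beta_k = -(d_k^{FW})^T H_k d_{k-1} / (d_{k-1}^T H_k d_{k-1})$. The sketch 
below computes this for a diagonal Hessian stored as a vector. It is a minimal 
illustration under that assumption; the names are hypothetical, and safeguards 
that a full implementation would need (step-length bounds to stay in $X$, 
restarts) are left out. 

```python
import numpy as np

def conjugate_direction(d_fw, d_prev, h_diag):
    """Return (d, beta) with d = d_fw + beta * d_prev conjugate to d_prev
    with respect to the diagonal Hessian diag(h_diag)."""
    hd_prev = h_diag * d_prev            # H d_{k-1}, O(n) because H is diagonal
    denom = d_prev @ hd_prev
    if denom <= 0.0:                     # degenerate case: keep the FW direction
        return d_fw.copy(), 0.0
    beta = -(d_fw @ hd_prev) / denom
    return d_fw + beta * d_prev, beta

d_fw = np.array([1.0, -1.0, 0.0])        # hypothetical FW direction
d_prev = np.array([0.5, 0.5, -1.0])      # hypothetical previous direction
h = np.array([1.0, 2.0, 4.0])            # diagonal of the Hessian
d, beta = conjugate_direction(d_fw, d_prev, h)
print(beta, d @ (h * d_prev))            # second value is zero up to rounding
```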




Fig. 1. Determination of the search direction for the CDFW algorithm 



4 Computational Experiments on Traffic Assignment 
Problems 

As mentioned, the main areas of application of the Frank-Wolfe method and 
its modifications are traffic assignment (e.g. [14]) and telecom traffic routing 
(e.g. [13]). In this paper we consider the traffic assignment problem (TAP). 
This problem can be formulated as a mathematical program with nonlinear 
objective and linear constraints ([14], Ch. 2). Thus we can use the methods 
proposed in this paper to solve the TAP. 

The three methods discussed have been implemented in MATLAB [12] 
and run on a 480 MHz Sun Sparc 10 Station. Two networks, frequently used 
to test methods for problems in transportation planning, are used in our 
experiments (see Table 1). 

Table 1. Test networks. 

Network        nodes   arcs   commodities 
Sioux Falls       24     76           528 
Winnipeg        1052   2836          4345 

In the experiments we start in an all-or-nothing assignment using free 
flow costs. The solver for the shortest path subproblems is implemented in 
C, using a MEX interface for communication between MATLAB and C. In the 
line search, we take a single Newton step, computed cheaply by the diagonality 
of $H_k$. In Figure 2 we display the relative errors of all three methods (in 
log-log scale diagrams) against the iteration count, for the Sioux Falls and 
Winnipeg cases. The relative error $RE_k$ is defined as 
$RE_k = (f(x_k) - LBD)/LBD$, where $LBD$ is the best known lower bound (which 
in the Sioux Falls case is the full precision optimal value). The iterations 
are continued much longer than what is typically done in practice, to show 
the asymptotic behavior, which is linear with slope $-1$ in the log-log diagram, 
corresponding to a relative error, $RE_k$, of the form 

$RE_k = \mathrm{const.}/k$.   (2) 

Fig. 2. Comparison between the FW, PFW and CDFW. 

The seemingly small difference between PFW and CDFW is illusory. Displaying 
the ratio of the relative errors of PFW and CDFW in the Sioux Falls and 
Winnipeg cases (Figure 3), we see that the ratio stabilizes rather soon 
around 2 and 1.5 respectively. In view of (2), this implies that PFW 
asymptotically needs to take around 100% and 50% more iterations respectively to 
achieve the same accuracy as CDFW. In the same way comparing CDFW 
with FW, the needed number of iterations of FW is around ten times that of 
CDFW. 

Fig. 3. The ratio between the relative error for CDFW and PFW 
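
The step from the error ratio to the iteration counts can be made explicit. The 
following short derivation assumes that both methods follow (2) with 
method-specific constants, written here as $c_P$ for PFW and $c_C$ for CDFW; 
these symbols, and $k_P$, $k_C$ for the iteration counts needed to reach a 
common accuracy, are introduced only for this illustration. 

```latex
\begin{align*}
RE_k^{\mathrm{PFW}} &= \frac{c_P}{k}, \qquad
RE_k^{\mathrm{CDFW}} = \frac{c_C}{k}, \qquad
r = \frac{RE_k^{\mathrm{PFW}}}{RE_k^{\mathrm{CDFW}}} = \frac{c_P}{c_C}, \\
RE_{k_P}^{\mathrm{PFW}} = RE_{k_C}^{\mathrm{CDFW}}
&\iff \frac{c_P}{k_P} = \frac{c_C}{k_C}
\iff k_P = r \, k_C .
\end{align*}
```

With $r \approx 2$ this gives twice as many iterations (100% more), and with 
$r \approx 1.5$ it gives 50% more. 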

References 

1. Arezki, Y. and van Vliet, D. (1990) A full analytical implementation of the 
PARTAN/Frank-Wolfe algorithm for equilibrium assignment. Transportation 
Sci. 24(1), 58-62. 

2. Bruynooghe, M., Gibert, A. and Sakarovitch, M. (1969) Une méthode 
d'affectation du trafic. Proceedings of the 4th International Symposium on the 
Theory of Road Traffic Flow, Bundesminister für Verkehr, Bonn, 198-204. 

3. Florian, M., Guélat, J. and Spiess, H. (1987) An efficient implementation of 
the "PARTAN" variant of the linear approximation method for the network 
equilibrium problem. Networks 17, 319-339. 

4. Frank, M. and Wolfe, P. (1956) An algorithm for quadratic programming. Naval 
Res. Logist. Quart. 3, 95-110. 

5. Fratta, L., Gerla, M. and Kleinrock, L. (1973) The flow deviation method: 
An approach to store-and-forward communication network design. Networks 3, 
97-133. 

6. Fukushima, M. (1984) A modified Frank-Wolfe algorithm for solving the traffic 
assignment problem. Transportation Res. Part B 18(2), 169-177. 

7. Larsson, T., Patriksson, M. and Rydergren, C. (1997) Applications of simplicial 
decomposition with nonlinear column generation to nonlinear network flows. 
Network optimization, 346-373. Springer, Berlin. 

8. LeBlanc, L. J. (1973) Mathematical programming algorithms for large scale 
network equilibrium and network design problems. PhD thesis, IE/MS Dept., 
Northwestern University, Evanston IL. 

9. LeBlanc, L., Helgason, R. and Boyce, D. (1985) Improved efficiency of the 
Frank-Wolfe algorithm for convex network programs. Transportation Sci. 19(4), 
445-462. 

10. Luenberger, D. G. (1984) Linear and Nonlinear Programming. Addison-Wesley, 
Reading, MA. 

11. Lupi, M. (1986) Convergence of the Frank-Wolfe algorithm in transportation 
network. Civil Engineering Systems 3, 7-15. 

12. Matlab Reference Guide (1996), The MathWorks, Mass. 

13. Ouorou, A., Mahey, P. and Vial, J.P. (2000) A survey of algorithms for convex 
multicommodity flow problems. Management Sci. 46(1), 126-147. 

14. Patriksson, M. (1994) The Traffic Assignment Problem - Models and Methods. 
VSP, Utrecht, The Netherlands. 

15. Powell, W. and Sheffi, Y. (1982) The convergence of equilibrium algorithms 
with predetermined step sizes. Transportation Sci. 16(1), 45-55. 

16. Shah, B., Buehler, R. and Kempthorne, O. (1964) Some algorithms for mini- 
mizing a function of several variables. J. Soc. Indust. Appl. Math. 12, 74-92. 

17. Weintraub, A., Ortiz, C. and Gonzalez, J. (1985) Accelerating convergence of 
the Frank-Wolfe algorithm. Transportation Res. Part B 19(2), 113-122. 





Online Algorithm for the Control 
of Traffic Signal Systems 



Klaus Ladner 

Institut für Statistik und Operations Research, Karl-Franzens-Universität Graz, 
Universitätsstraße 15, 8010 Graz, Austria 



Abstract. An algorithm is designed that continuously processes traffic-dependent 
data of an intersection and strategic specifications of the traffic computer in 
order to plan the next cycle of a traffic signal installation. 

The requirements of the traffic planner are taken into account. Time windows can 
be defined within which traffic streams either must not have green under any 
circumstances or must have green, so that coordination with the surrounding 
intersections is possible. 

Combinations of traffic streams of the intersection under consideration are 
grouped into phases such that sensible phase transition schemes can be 
constructed. Based on a defined objective function and several constraints, 
feasible combinations of phases are sought that maximize the "performance" of 
the control. To this end, the problem is transformed into a state/layer graph in 
which a longest feasible path is sought. 

Since the algorithm is to run on intersection computers, it must be able to 
deliver good results extremely fast and with modest memory requirements. 



1 Introduction 

Controlling traffic signal systems is a complex task that one tries to solve in 
a modular fashion. This paper sets out the task of one of these modules and 
sketches a solution approach that builds on the module described in [3]. 

The topic of traffic signal control has already been investigated in a number of 
scientific works (see, e.g., [4] or [1]). In [2], Riedel treats the control as a 
feedback loop and even attempts to solve the problem optimally as a kind of 
dynamic programming. 

We consider a situation in which an intersection computer is connected to a 
traffic computer. The traffic computer supplies information concerning the 
coordination of the traffic signal system with surrounding installations. The 
intersection computer continuously evaluates current traffic data. These traffic 
data consist, for example, of messages about approaching public transport 
vehicles that are to be given priority through the intersection. The difficulty 
is that conflicting objectives can arise when coordination is to be maintained 
and, at the same time, the actual traffic situation is to be responded to. 







What is sought is a rough plan of the switch-on and switch-off times of the 
individual traffic streams for the next cycle, taking into account this 
information and the further necessary constraints given below. Rough planning 
here means that the approximate times are determined; the exact determination to 
the second, as well as the optimization of the phase transitions, is not 
necessary at this point, since this task is handled by another module. 

2 Problem Statement

Let there be $n$ traffic streams and a cycle time $T_u$. Each traffic
stream $i$ has a maximum waiting time $tw_i^{\max}$, a minimum green time
$tg_i^{\min}$, a minimum red time $tr_i^{\min}$ and a maximum green time
$tg_i^{\max}$. Any two traffic streams are either compatible or
incompatible. For every two incompatible traffic streams $i$ and $j$ there
is an intergreen time $tz_{ij}$ that specifies how many seconds must pass
after the end of green for $i$ before $j$ may receive green.

For each traffic stream a green band may be defined within which this
stream must be green in any case (green band start $BA_i$ and green band
end $BE_i$).

Furthermore, for each traffic stream a function $f_i(t)$ is given that
specifies, as a function of time, the contribution of the stream to the
objective value. These functions capture all preferences and weightings of
the operator of the traffic signal system regarding the waiting time and
the number of stops of a traffic stream, the priority of the traffic
streams among each other, and statistical distributions of the traffic
flows.

We seek switch-on and switch-off times ($ein_i$ and $aus_i$) for all
traffic streams such that the sum of their areas, i.e. of the integrals of
$f_i$ over the green times, is maximal. Ignoring the wrap-around at the end
of the cycle time, the problem can be formulated as follows:

$$
\begin{aligned}
\max\ & \sum_{i=1}^{n} \int_{ein_i}^{aus_i} f_i(t)\,dt \\
\text{s.t.}\ & ein_i - aus_i \le tw_i^{\max} \\
& aus_i - ein_i \ge tg_i^{\min} \\
& ein_i - aus_i \ge tr_i^{\min} \\
& aus_i - ein_i \le tg_i^{\max} \\
& ein_i - aus_j \ge tz_{ji} \\
& ein_i \le T_u \\
& aus_i \le T_u \\
& aus_i,\ ein_i \ge 0
\end{aligned}
$$








3 Images

A traffic stream is either switched on or switched off. The traffic signal
system can therefore be in $2^n$ theoretical states. In fact there are
fewer, since incompatible traffic streams cannot be switched on at the
same time.

Since not all states will be used in practice, an algorithm is developed
that determines a set of "sensible" states. The state of a traffic signal
system can be described by the set of traffic streams that are switched
on. Such a set is called an image (Bild).

First, maximal images are sought. A maximal image is an image that is
maximal in the sense that no further traffic stream can be added to it.

We seek maximal images and subsets of them such that, starting from an
arbitrary maximal image and combining subsets of other maximal images,
every traffic stream receives green exactly once.

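Let $V = \{VS_i \mid i = 1, \dots, n\}$ be the set of all traffic streams
and let $R$ be a compatibility relation. A maximal image is a set
$B \subseteq V$ for which $VS_i \, R \, VS_j$ holds for all $i, j$ with
$VS_i, VS_j \in B$, and for which, for every $i$ with $VS_i \notin B$,
there exists a $j$ with $VS_j \in B$ such that $VS_i \, \neg R \, VS_j$.

Let the number of maximal images be $v$. A maximal image is denoted by
$B_i$ for $i \in \{1, \dots, v\}$. The rest sets are
$\bar{B}_i = V \setminus B_i$, $i = 1, \dots, v$.

The $k$-th maximal image of the second level, starting from the $i$-th
rest set, is the set $B_{ik} \subseteq \bar{B}_i$ for which
$VS_j \, R \, VS_r$ holds for all $j, r$ with $VS_j, VS_r \in B_{ik}$, and
for which, for every $j$ with $VS_j \in \bar{B}_i \setminus B_{ik}$, there
exists an $r$ with $VS_r \in B_{ik}$ such that $VS_j \, \neg R \, VS_r$.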

This decomposition is continued analogously until the resulting rest sets
are admissible, i.e. are themselves maximal images and therefore cannot be
decomposed further.
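
The search for maximal images can be read as enumerating the maximal sets
of pairwise compatible traffic streams, i.e. the maximal cliques of the
compatibility graph. The following Python sketch uses the basic
Bron-Kerbosch recursion for this purpose. The function name
`maximal_images`, the dictionary layout and the four-stream example
junction are illustrative assumptions and are not taken from the module
described here.

```python
from typing import Dict, FrozenSet, List, Set


def maximal_images(compatible: Dict[str, Set[str]]) -> List[FrozenSet[str]]:
    """Enumerate all maximal sets of pairwise compatible traffic streams.

    compatible[s] holds the streams that may be green together with s
    (symmetric, without s itself). Basic Bron-Kerbosch enumeration.
    """
    result: List[FrozenSet[str]] = []

    def expand(current: Set[str], candidates: Set[str], excluded: Set[str]) -> None:
        if not candidates and not excluded:
            result.append(frozenset(current))  # no stream can be added: maximal
            return
        for s in sorted(candidates):
            expand(current | {s}, candidates & compatible[s], excluded & compatible[s])
            candidates = candidates - {s}      # s has been fully explored
            excluded = excluded | {s}

    expand(set(), set(compatible), set())
    return result


# Hypothetical junction: only opposite directions are compatible.
compatible = {
    "N": {"S"}, "S": {"N"},
    "E": {"W"}, "W": {"E"},
}
images = maximal_images(compatible)
print(sorted(sorted(b) for b in images))
```

Applying the same function to the rest set of a maximal image (with the
compatibility relation restricted to that rest set) would give the maximal
images of the second level.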

Let the number of images obtained in this way be $m$. Each image is marked
as blocked at those times at which a traffic stream contained in the image
is blocked, or at which a traffic stream not contained in the image has a
green band.

4 State/Layer Graph

The time axis is discretized into intervals of length $\Delta$. This yields
a state/layer graph with $m$ states and $k$ layers, where
$k = 1 + \mathrm{int}(T_u / \Delta)$. In theory there could thus be
$m \cdot k$ nodes. In fact the number of nodes is smaller, because no node
is created at those positions of the state/layer graph where the
associated image has a blocked period in the corresponding time interval.

There is an arc from node $i$ to node $j$ if and only if it is possible to
change to the image of node $j$ at the time assigned to the end of node
$i$ in such a way that, allowing for a potential intergreen period, the
image of node $j$ can be switched on at the time corresponding to the
start of node $j$.

A source is introduced that is adjacent to all nodes of the first layer,
and a sink that is adjacent to all nodes of the last layer.

Each node receives a score, namely the sum, over the traffic streams that
are switched on, of the integrals over the time assigned to the node.

We seek a feasible path through this graph that maximizes the sum of the
node scores. Since the computing time for a complete enumeration is not
available, a heuristic procedure is used whose description is beyond the
scope of this contribution (see also Fig. 1).
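
To make the graph model concrete, the following Python sketch computes a
maximum-score path by dynamic programming over the layers. It is not the
heuristic referred to above. It assumes, for simplicity, that arcs only
connect consecutive layers and it ignores further feasibility conditions
such as minimum green times. The names `longest_layered_path`, `value` and
`allowed` are hypothetical.

```python
from typing import Callable, List, Optional, Tuple


def longest_layered_path(
    value: List[List[Optional[float]]],
    allowed: Callable[[int, int], bool],
) -> Tuple[float, List[int]]:
    """value[layer][state] is the node score, or None where the image is blocked.
    allowed(a, b) tells whether image b may follow image a in the next layer.
    Returns the best total score and the chosen state for every layer."""
    layers, states = len(value), len(value[0])
    neg = float("-inf")
    best = [[neg] * states for _ in range(layers)]
    pred = [[-1] * states for _ in range(layers)]
    for s in range(states):                      # arcs from the source
        if value[0][s] is not None:
            best[0][s] = value[0][s]
    for layer in range(1, layers):
        for s in range(states):
            if value[layer][s] is None:
                continue                         # no node at this position
            for p in range(states):
                if best[layer - 1][p] > neg and allowed(p, s):
                    cand = best[layer - 1][p] + value[layer][s]
                    if cand > best[layer][s]:
                        best[layer][s], pred[layer][s] = cand, p
    end = max(range(states), key=lambda s: best[-1][s])   # arc into the sink
    if best[-1][end] == neg:
        raise ValueError("no path from source to sink")
    path = [end]
    for layer in range(layers - 1, 0, -1):
        path.append(pred[layer][path[-1]])
    return best[-1][end], path[::-1]


# Hypothetical scores for 2 images and 3 layers; image 0 is blocked in layer 1.
scores = [[3, 1], [None, 4], [2, 1]]
total, solution_vector = longest_layered_path(scores, lambda a, b: True)
print(total, solution_vector)
```

The run time of this sketch grows with the number of layers times the
square of the number of images, which illustrates why a restricted or
heuristic search may be preferable on an intersection controller.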




Fig. 1. State/layer graph with solution vector



5 Result

The solution vector is then transformed so that each traffic stream is
assigned those green times that correspond to the time intervals of the
nodes in the solution vector in which the traffic stream occurs (see also
Fig. 2).
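
As an illustration of this transformation, the Python sketch below turns a
solution vector (one image index per layer) into green intervals per
traffic stream and merges consecutive layers into one interval. The
function name `green_intervals`, the data layout and the example values are
assumptions made for this sketch only.

```python
from typing import Dict, List, Set, Tuple


def green_intervals(
    path: List[int], images: List[Set[str]], delta: float
) -> Dict[str, List[Tuple[float, float]]]:
    """path[l] is the image active in layer l, images[i] the streams of image i,
    delta the interval length. Returns the green intervals of every stream."""
    result: Dict[str, List[Tuple[float, float]]] = {}
    for layer, image in enumerate(path):
        start, end = layer * delta, (layer + 1) * delta
        for stream in images[image]:
            spans = result.setdefault(stream, [])
            if spans and spans[-1][1] == start:
                spans[-1] = (spans[-1][0], end)   # extend a contiguous green time
            else:
                spans.append((start, end))
    return result


# Hypothetical plan: image 0 for two layers, then image 1, with 5-second layers.
plan = green_intervals([0, 0, 1], [{"N", "S"}, {"E", "W"}], delta=5)
print(plan)
```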










Fig. 2. Plan with transformed green times



References 

1. Mirchandani, P., Knyazyan, A., Head, L., Wu, W. (2001) An Approach Towards
the Integration of Bus Priority, Traffic Adaptive Signal Control, and Bus
Information/Scheduling Systems. In: Voß, S., Daduna, J. R. (eds.):
Computer-Aided Scheduling of Public Transport, Springer, Heidelberg, 319-334

2. Riedel, Th. (1994) Regelung in Verkehrssystemen, Zürich

3. Rudolf Keller AG (1993) VS-PLUS Beschreibung, Muttenz

4. Schnabel, W., Lohse, D. (1980) Grundlagen der Straßenverkehrstechnik und der
Straßenverkehrsplanung Band 1, Verlag für Bauwesen, Berlin




Optimal Sorting Machine Allocation in the Postal 
Distribution Network 



Jaroslav Janáček

Department of Transportation Networks, Faculty of Management and Informatics, 
University of Zilina, Slovak Republic, 010 26, e-mail: iardo@frdsa.fri.utc.sk 



1. Introduction 

The postal distribution system is a many-to-many distribution system with
several transshipments [1], [3]. First, consignments are transported by
vehicles from post offices to transit centers. Another fleet continues the
transport from the transit centers to a so-called sorting center, which
serves a given cluster of transit centers. In the sorting center, the
consignments coming from the individual transit centers of the cluster are
rough-sorted: they are separated according to destination sorting centers,
and each class is further divided according to destination transit centers.
After rough sorting, each bulk of consignments is transported to its
destination sorting center. There, fine sorting separates the consignments
according to destination post offices. Fine sorting is complete once all
bulks have been sorted. The sorted consignments are then loaded onto
vehicles and transported to the destination transit centers, where they
remain until distribution to the destination post offices starts. Note that
the earliest departure time of a bulk of consignments from its original
transit center results from the given departure times from the transit
centers. A similar situation occurs at the opposite end of the transport
chain. The arrival time of a bulk at the destination transit center is
fixed so that the distribution fleet can complete its work in time. It
follows that the departure time of a bulk from the destination sorting
center is given as well.

In this way, each pair of origin and destination sorting centers is
associated with a pair of times: the first is the earliest possible end of
rough sorting at the origin sorting center, and the second is the latest
possible time at which fine sorting at the destination center must start.
Each such pair of times constitutes a time window in which the transport of
the associated bulk must be performed. The transport time for any pair of
sorting centers is fixed, but the end time of rough sorting and the start
time of fine sorting depend on the numbers of rough and fine sorting
machines, respectively. The more rough sorting machines there are, the
earlier rough sorting ends; similarly, the more fine sorting machines there
are, the later fine sorting may start. The objective is to minimize the
total cost of the allocated rough and fine sorting machines under the
above-mentioned constraints.

In the following sections we give a mathematical description of the time
constraints, establish a mathematical programming model of the problem and
outline a special exact approach to solving it.







2. Time constraints of sorting machine allocation problem

As mentioned in the introduction, the end time of the rough sorting process
is influenced by the arrival times of bulks at the origin sorting center.
Let $t_1, t_2, \dots, t_q$ be all the arrival times in increasing order,
and let $p_1, p_2, \dots, p_q$ be the sizes of the bulks associated with
these arrival times. If $r$ denotes the time one rough sorting machine
needs to process one consignment and $w$ denotes the number of rough
sorting machines allocated at the sorting center, then $w/r$ is the sorting
rate of the center. The end of rough sorting is the resulting value $T$ of
the following algorithm [2]:

Step 1. Set $T = t_1 + p_1 r / w$ and $j = 1$.

Step 2. Repeat $j = j + 1$, $T = \max\{T, t_j\} + p_j r / w$ until $j = q$.

To demonstrate this algorithm, consider an example with three arrival times
$t_1$, $t_2$ and $t_3$ equal to 450, 470 and 500, respectively, which are
associated with 6000, 6000 and 3600 consignments. For $r = 0.01$ and
$w = 2$, the current number of unsorted consignments is a saw-shaped
function of time (see Fig. 1) and $T = 528$.
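
The two steps translate directly into a short loop. The following Python
sketch is one possible rendering; the function name `rough_sorting_end` and
its argument names are illustrative. For the data of the example it yields
the end times 528 for two machines and 512 for three machines.

```python
from typing import Sequence


def rough_sorting_end(
    arrivals: Sequence[float], sizes: Sequence[float], r: float, w: int
) -> float:
    """End time T of rough sorting for bulks arriving at `arrivals` (increasing)
    with `sizes` consignments, processing time r per consignment and w machines."""
    t_end = arrivals[0] + sizes[0] * r / w             # Step 1
    for t_j, p_j in zip(arrivals[1:], sizes[1:]):      # Step 2
        # Sorting of bulk j starts when the machines are free and the bulk is there.
        t_end = max(t_end, t_j) + p_j * r / w
    return t_end


arrivals = [450, 470, 500]
sizes = [6000, 6000, 3600]
end_two = rough_sorting_end(arrivals, sizes, r=0.01, w=2)
end_three = rough_sorting_end(arrivals, sizes, r=0.01, w=3)
print(end_two, end_three)
```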




Fig. 1 

When another example is considered with the same $t_j$, $p_j$ and $r$, but
with a different number of machines $w = 3$, we get the function of
unsorted consignments shown in Fig. 2, where $T = 512$.




It can be noticed that the end of rough sorting is given by the arrival
time $t(w)$ following the last idle interval of the sorting process and by
the number of consignments $p(w)$ delivered from this time on. The
resulting time $T^R(w)$ then takes the value $t(w) + p(w) r / w$. Let us
imagine for a moment that $w$ increases continuously. As long as no further
idle time emerges after the former $t(w)$, $p(w)$ stays constant and
$T^R(w)$ decreases like a hyperbola. Whenever a new idle time occurs after
the former $t(w)$, the new $t(w)$ takes a higher value and the associated
$p(w)$ shrinks. Then $T^R(w)$ again decreases like a hyperbola, but more
slowly. We conclude that $T^R(w)$ is a decreasing convex function of $w$.

In a similar way we analyze the starting time $T^F(w)$ of fine sorting at a
destination sorting center. If $t(w)$ denotes the last departure before the
first idle interval of the sorting process and $p(w)$ the number of
consignments that are to be taken from the sorting center up to $t(w)$,
then the resulting time takes the value $t(w) - p(w) r / w$, where $r$
denotes the time one fine sorting machine needs to process one consignment.
Thus $T^F(w)$ is a concave increasing function of $w$.

Now, for each sorting center $s \in S$, let us consider the above-mentioned
functions $T_s^R(w_s^R)$ and $T_s^F(w_s^F)$, which provide the earliest end
of rough sorting and the latest starting time of fine sorting for integer
values of $w_s^R$ and $w_s^F$.

Let $v_s^R$ and $v_s^R + m(s)$ denote lower and upper bounds on the number
of rough sorting machines that can be allocated at sorting center $s$, and
let $v_s^F$ and $v_s^F + p(s)$ denote lower and upper bounds on the number
of fine sorting machines. Then variables $u_s^R$ and $u_s^F$ may be
introduced by the equations $u_s^R = w_s^R - v_s^R$ and
$u_s^F = w_s^F - v_s^F$. Functions $R_s$ and $F_s$ are obtained by
substituting for $w_s^R$ and $w_s^F$ in $T_s^R(w_s^R)$ and $T_s^F(w_s^F)$:
$R_s(u_s^R) = T_s^R(v_s^R + u_s^R)$ and $F_s(u_s^F) = T_s^F(v_s^F + u_s^F)$
for $u_s^R \in \langle 0, m(s) \rangle$ and
$u_s^F \in \langle 0, p(s) \rangle$. Furthermore, let $t_{rs}$ denote the
time necessary to transport sorted consignments from origin sorting center
$r$ to destination sorting center $s$. After these preliminaries, the time
constraints of the problem can be stated as follows:

$$R_r(u_r^R) + t_{rs} \le F_s(u_s^F) \quad \text{for } r, s \in S \qquad (1)$$

$$u_s^R \in \{0, 1, \dots, m(s)\} \quad \text{for } s \in S \qquad (2)$$

$$u_s^F \in \{0, 1, \dots, p(s)\} \quad \text{for } s \in S \qquad (3)$$

These time constraints determine the set of feasible solutions of a convex
nonlinear discrete problem. Let $c^R$ and $c^F$ denote the costs of one
rough and one fine sorting machine, respectively. Then the whole model can
be stated as:

$$\text{Minimize } \sum_{s \in S} \left( c^R u_s^R + c^F u_s^F \right) \qquad (4)$$

Subject to (1), (2) and (3).
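
For very small instances the discrete model can be checked by plain
enumeration, which helps to see how constraint (1) couples the rough
sorting machines at one center with the fine sorting machines at another.
The Python sketch below does exactly that. It is only an illustration and
not the exact approach developed in this paper; the tables `R`, `F`,
`travel` and the two cost values are invented example data, and the
function name `cheapest_allocation` is hypothetical.

```python
from itertools import product


def cheapest_allocation(R, F, travel, cost_rough, cost_fine):
    """R[s][u]: end of rough sorting at s with u additional machines.
    F[s][u]: latest start of fine sorting at s with u additional machines.
    travel[(r, s)]: transport time from r to s.
    Returns (cost, rough machines per center, fine machines per center)."""
    centers = sorted(R)
    n = len(centers)
    ranges = [range(len(R[s])) for s in centers] + [range(len(F[s])) for s in centers]
    best = None
    for choice in product(*ranges):
        u_rough = dict(zip(centers, choice[:n]))
        u_fine = dict(zip(centers, choice[n:]))
        # Constraint (1): every bulk must arrive before fine sorting has to start.
        feasible = all(
            R[r][u_rough[r]] + t <= F[s][u_fine[s]] for (r, s), t in travel.items()
        )
        if feasible:
            cost = cost_rough * sum(u_rough.values()) + cost_fine * sum(u_fine.values())
            if best is None or cost < best[0]:
                best = (cost, u_rough, u_fine)
    return best


# Invented data: R decreasing and convex, F increasing and concave.
R = {"A": [606, 528, 512], "B": [600, 540, 520]}
F = {"A": [700, 730, 745], "B": [690, 720, 735]}
travel = {("A", "B"): 180, ("B", "A"): 180}
best = cheapest_allocation(R, F, travel, cost_rough=3, cost_fine=2)
print(best)
```

The enumeration grows exponentially with the number of sorting centers,
which is the motivation for the linear reformulation in the next section.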



3. Integer linear program of machine allocation problem

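We make use of the fact that the LP-relaxation of model (4), (1), (2), (3)
is convex and separable and replace it by a piecewise linear model. With
$c^R$ and $c^F$ denoting the costs of one rough and one fine sorting
machine, let us define the following constants for each $s \in S$:

$$a_{sk} = R_s(k-1) - R_s(k), \quad d_{sk} = c^R / a_{sk} \quad \text{for } k = 1, 2, \dots, m(s),$$

$$b_{st} = F_s(t) - F_s(t-1), \quad e_{st} = c^F / b_{st} \quad \text{for } t = 1, 2, \dots, p(s).$$

Because $R_s$ is a convex decreasing function, the inequalities
$a_{s1} > a_{s2} > a_{s3} > \dots > a_{sm(s)} > 0$ and therefore
$0 < d_{s1} < d_{s2} < d_{s3} < \dots < d_{sm(s)}$ must hold. Considering
that $F_s$ is a concave increasing function, we similarly get
$b_{s1} > b_{s2} > b_{s3} > \dots > b_{sp(s)} > 0$ and therefore
$0 < e_{s1} < e_{s2} < e_{s3} < \dots < e_{sp(s)}$. Let us introduce two
sequences of real variables $u_{sk}$ and $v_{st}$ for $s \in S$,
$k = 1, 2, \dots, m(s)$ and $t = 1, 2, \dots, p(s)$. Now the LP-relaxation
of model (4), (1), (2), (3) can be replaced by the following linear model,
which corresponds to a piecewise linear approximation of the
LP-relaxation.

$$\text{Minimize } \sum_{s \in S} \left( \sum_{k=1}^{m(s)} d_{sk} u_{sk} + \sum_{t=1}^{p(s)} e_{st} v_{st} \right) \qquad (5)$$

$$\text{Subject to } \sum_{k=1}^{m(r)} u_{rk} + \sum_{t=1}^{p(s)} v_{st} \ge R_r(0) + t_{rs} - F_s(0) \quad \text{for } r, s \in S, \qquad (6)$$

$$u_{sk} \le a_{sk} \quad \text{for } s \in S \text{ and } k = 1, \dots, m(s), \qquad (7)$$

$$v_{st} \le b_{st} \quad \text{for } s \in S \text{ and } t = 1, \dots, p(s), \qquad (8)$$

$$u_{sk} \ge 0, \ v_{st} \ge 0 \quad \text{for } s \in S \text{ and } k = 1, \dots, m(s) \text{ and } t = 1, \dots, p(s). \qquad (9)$$

The relations between the former variables $u_s^R$, $u_s^F$ and the latter
variables $u_{sk}$, $v_{st}$ are given by the equations
$u_s^R = \sum_{k=1}^{m(s)} u_{sk} / a_{sk}$ for $s \in S$ and
$u_s^F = \sum_{t=1}^{p(s)} v_{st} / b_{st}$ for $s \in S$.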

The above-mentioned property of the coefficients $d_{sk}$ and $e_{st}$
ensures that the optimal values of $u_{sk}$ and $v_{st}$ constitute a
connected range of real numbers starting at zero. Another property of an
optimal solution of (5)-(9) is that whenever all variables $u_{sk}$ and
$v_{st}$ take values from the sets $\{0, a_{sk}\}$ and $\{0, b_{st}\}$,
respectively, the optimal solution of (5)-(9) constitutes an optimal
solution of (4), (1), (2), (3).
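
The monotonicity of the constants can be observed on the small example from
Section 2. The Python sketch below tabulates the end time of rough sorting
for one to five machines, derives the time savings $a_k$ of each additional
machine and the resulting cost per saved time unit $d_k$. It assumes that
$d_k$ is the cost of one machine divided by $a_k$; the lower bound of one
machine and the machine cost of 100 are invented values.

```python
def rough_sorting_end(arrivals, sizes, r, w):
    """End time of rough sorting for w machines (Steps 1 and 2 in Section 2)."""
    t_end = arrivals[0] + sizes[0] * r / w
    for t_j, p_j in zip(arrivals[1:], sizes[1:]):
        t_end = max(t_end, t_j) + p_j * r / w
    return t_end


arrivals, sizes, r = [450, 470, 500], [6000, 6000, 3600], 0.01
lower_bound, extra = 1, 4      # assumed: at least 1 machine, up to 4 additional ones
machine_cost = 100.0           # assumed cost of one rough sorting machine

# R[u] = end of rough sorting with lower_bound + u machines
R = [rough_sorting_end(arrivals, sizes, r, lower_bound + u) for u in range(extra + 1)]
a = [R[k - 1] - R[k] for k in range(1, extra + 1)]   # time saved by the k-th extra machine
d = [machine_cost / a_k for a_k in a]                # cost per unit of time saved

for k, (a_k, d_k) in enumerate(zip(a, d), start=1):
    print(f"k={k}: a={a_k:.1f}  d={d_k:.2f}")
```

In this example the savings shrink with every additional machine while the
cost per saved time unit grows, so a cost-minimizing solution uses the
segments in order of increasing index.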

Furthermore, it follows from the introduction of the constants $a_{sk}$ and
$b_{st}$ that they can be arbitrarily enlarged by multiplying (5) and (6)
by a positive number. This makes it possible to approximate the values of
the constants by positive integers with the required precision. That is why
only integer constants are considered in the next sections.



4. Branch and bound technique 

To obtain an optimal integer solution of problem (5)-(9), a branch and
bound method with a depth-first search scheme was employed. Branching is
performed by adding one of the constraints (a) or (b) to the model that
describes the processed node of the search tree.

(a) Let $u_{sq} = 0$ for all $q \in \{k, k+1, \dots, m(s)\}$ for given $s$ and $k$.

(b) Let $u_{sq} = a_{sq}$ for all $q \in \{1, 2, \dots, k\}$ for given $s$ and $k$.

The same constraints can be constructed for the variables $v_{st}$ and the
constants $b_{st}$, $p(s)$. The core of the method is an algorithm that
finds an optimal non-integer solution of problem (5)-(9). The objective
function value of this optimal solution yields a lower bound for the
inspected node of the search tree, and rounding the nonzero values of
$u_{sk}$ or $v_{st}$ up to the corresponding $a_{sk}$ or $b_{st}$ gives a
feasible integer solution, which can be used to update the current best
solution. The objective function value of the current best solution
constitutes the upper bound used in the search scheme. Any linear
programming algorithm may be used to obtain the optimal solution of problem
(5)-(9), but we developed a special primal-dual algorithm to accelerate the
computation. This algorithm is based on the primal model:





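$$\text{Minimize } \sum_{s \in S} \left( \sum_{k=lk(s)+1}^{hk(s)} d_{sk} u_{sk} + \sum_{t=lt(s)+1}^{ht(s)} e_{st} v_{st} \right) \qquad (10)$$

$$\text{Subject to } \sum_{k=lk(r)+1}^{hk(r)} u_{rk} + \sum_{t=lt(s)+1}^{ht(s)} v_{st} \ge c_{rs} \quad \text{for } r, s \in S, \qquad (11)$$

$$u_{sk} \le a_{sk} \quad \text{for } s \in S \text{ and } k = lk(s)+1, \dots, hk(s), \qquad (12)$$

$$v_{st} \le b_{st} \quad \text{for } s \in S \text{ and } t = lt(s)+1, \dots, ht(s), \qquad (13)$$

$$u_{sk} \ge 0, \ v_{st} \ge 0 \quad \text{for } s \in S \text{ and } k = lk(s)+1, \dots, hk(s) \text{ and } t = lt(s)+1, \dots, ht(s). \qquad (14)$$

In this model, the constants $lk(s)$ and $hk(s)+1$ denote the highest index
of the variables $u_{sk}$ fixed at value $a_{sk}$ and the lowest index of
the variables fixed at 0, respectively. The constants $lt(s)$ and $ht(s)+1$
have the same meaning for the variables $v_{st}$. It is assumed that
$lk(s) < hk(s)$ and $lt(s) < ht(s)$ hold. The constant $c_{rs}$ in (11)
denotes the following expression:

$$c_{rs} = R_r(0) + t_{rs} - F_s(0) - \sum_{k=1}^{lk(r)} a_{rk} - \sum_{t=1}^{lt(s)} b_{st}$$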

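Having introduced nonnegative dual variables $x_{rs}$ for constraints (11),
$y_{sk}$ for constraints (12) and $z_{st}$ for constraints (13), the dual
model can be stated as:

$$\text{Maximize } \sum_{r \in S} \sum_{s \in S} c_{rs} x_{rs} - \sum_{s \in S} \sum_{k=lk(s)+1}^{hk(s)} a_{sk} y_{sk} - \sum_{s \in S} \sum_{t=lt(s)+1}^{ht(s)} b_{st} z_{st} \qquad (15)$$

$$\text{Subject to } \sum_{s \in S} x_{rs} - y_{rk} \le d_{rk} \quad \text{for } r \in S \text{ and } k = lk(r)+1, \dots, hk(r), \qquad (16)$$

$$\sum_{r \in S} x_{rs} - z_{st} \le e_{st} \quad \text{for } s \in S \text{ and } t = lt(s)+1, \dots, ht(s), \qquad (17)$$

$$x_{rs} \ge 0, \ y_{sk} \ge 0, \ z_{st} \ge 0 \quad \text{for } r, s \in S, \ k = lk(s)+1, \dots, hk(s), \ t = lt(s)+1, \dots, ht(s). \qquad (18)$$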



This primal-dual system somewhat resembles the primal-dual model of the
transportation problem, which led to the idea of developing an algorithm
similar to the Hungarian method. The developed algorithm starts with a
feasible solution of (10)-(14) that satisfies all complementarity
conditions except those formed by constraints (16) and the associated
variables $u_{rk}$. Then a bipartite graph is considered with vertex
subsets $r \in S$ and $s \in S$ and a set of arcs in which each arc
corresponds to a pair of vertices $r$, $s$ for which the associated
constraint (11) is satisfied as an equality. On this graph an augmenting
path is searched for, starting from a vertex $r$ for which (16) is
satisfied by the current solution as a strict inequality. If an augmenting
path is found, the current solution of (15)-(18) is improved. Otherwise, a
structure like a Hungarian tree arises on the bipartite graph, and it is
used to improve the current solution of (10)-(14). The process is repeated
until all complementarity conditions hold.








5. Conclusions 

The approach described above was implemented as part of a decision support
system for postal network design [1], [2]. This system was used to analyze
the current state and possible variants of the Czech postal network. In
these experiments, carried out on a network with 69 possible sorting center
locations, the algorithm proved to be reliable and fast enough to solve
problems of this size.

Furthermore, it was found that the primal-dual algorithm often produces an
integer solution of the problem without the branching process having to be
employed.



References 

1. Jablonsky J, Lauber J (1999) A Time-Cost Optimization of the National Postal
Distribution Network. Journal of Multi-Criteria Decision Analysis 8: 51-56

2. Janáček J (1997) Location and Capacity Determination of Postal Sorting Centres (in
Czech). Proceedings of the international conference POST POINT '97, University of
Zilina, pp 77-81

3. Lauber J, Jablonsky J (1997) The Modeling of the Postal Distribution Network.
Proceedings of the international conference MOSIS'97, vol 3, Ostrava, pp 183-188



Acknowledgement 



This work has been supported by research projects CETRA and VEGA 1/7211/20.




A Combined Approach to Solve the Pickup 
and Delivery Selection Problem 



Jörn Schönberger, Herbert Kopfer, and Dirk C. Mattfeld

University of Bremen, Faculty of Economics, Chair of Logistics, D-28334 Bremen, 
Germany, {sberger,kopfer,dirk}@logistik.uni-bremen.de 



Abstract. In this article we propose a model which decides upon the execution 
of transportation requests in order to maximize their contribution to the overall 
profit. Positive contributions of requests are obtained by composing appropriate 
routes for the vehicles involved. The composition of routes is hindered by limited 
capacity of vehicles and time windows associated to requests. A hybrid approach 
consisting of a parallel path construction heuristic which seeds a Genetic Algorithm 
is presented. We assess its capability for a set of suitable benchmark instances for 
pickup and delivery problems with time windows. 



1 Introduction 



In a pickup and delivery problem, vehicles have to visit a set of locations
in order to transport goods from origins to destinations. A request
specifies the pickup and the delivery location and the volume or weight of
the goods to be transported. Each request must be executed by one vehicle.
The minimization of execution costs is impeded by maximal vehicle
capacities and is often further restricted by time windows.

In this contribution the well-known Pickup and Delivery Problem with Time
Windows (PDPTW) [4] is extended by a decision on the acceptance of orders.
The aim is to identify a selection of requests that leads to a maximal
profit contribution, defined as the difference between revenues and costs
in monetary units. The revenue of a single request typically does not cover
the costs of its execution, so that several requests have to be combined in
a route to achieve positive contributions. Requests with negative
contributions to a route may be rejected.

The incorporation of the selection decision into the route generation task
has received only minor attention so far. Some contributions deal with
delivery or collection problems with one [5,6] or several vehicles [3]. For
pickup and delivery problems, only the single-vehicle case has been
investigated [9].

Next, an exact description of the investigated problem is given followed by 
the presentation of a hybrid Genetic Algorithm. Its capability is assessed by 
means of suitable benchmark instances. 







2 The Pickup and Delivery Selection Problem 

To fulfill the available customer requests, a collection of routes for the
$m$ vehicles within a fleet $\mathcal{F}$ is to be determined.

Every customer request $r_i$ is specified by a quadruple
$(PU_i, DL_i, c_i, REV_i)$. The pickup activity $PU_i$ takes place at
location $p_i^+$, whereas the delivery activity $DL_i$ is demanded at
location $p_i^-$. A time window $[t_{min}, t_{max}]$ is specified for each
activity. A load of volume $c_i$ is to be picked up at $p_i^+$ and to be
delivered to $p_i^-$. A revenue $REV_i$ is associated with $r_i$, and the
cost of moving along one distance unit is one money unit.

The available customer requests form the set
$\mathcal{R} := \{r_i \mid i = 1, \dots, n\}$. The set of all involved
locations is denoted by
$\mathcal{V} := \mathcal{I} \cup \mathcal{T} \cup \{p_1^+, \dots, p_n^+\} \cup \{p_1^-, \dots, p_n^-\}$,
where $\mathcal{I}$ consists of the initial positions of the vehicles and
$\mathcal{T}$ of the depots.

Feasible Routes. An operation is a triple $\pi := (p, a(p), e(p))$, where
$p$ represents the location of a pickup, a delivery, a start or a stop
activity. $a(p)$ carries the arrival time of the vehicle at location $p$
and $e(p)$ the leaving time from $p$. If the vehicle arrives at $p$ before
the associated time window has opened, it has to wait until the earliest
allowed operation time $t_{min}(\pi)$ for $\pi$ has passed. The leaving
time of $\pi$ must not be later than the latest allowed operation time
$t_{max}(\pi)$.

A sequence of operations $\Pi = (\pi_1, \dots, \pi_{n_\Pi})$ is called a
route. In the remainder of this article the first component of $\pi_i$ is
denoted by $p_i$. Each route includes exactly one temporally unrestricted
dummy request of zero capacity and zero revenue that describes the initial
and the final operation of a vehicle. The route $\Pi$ includes $N_\Pi$
requests.

Let $t_{p_i, p_{i+1}}$ be the travel time between $p_i$ and $p_{i+1}$. The
arrival and the leaving times are calculated recursively. We have
$a(p_1) = e(p_1) := 0$ and $a(p_i) := e(p_{i-1}) + t_{p_{i-1}, p_i}$. The
leaving time is obtained by $e(p_i) := \max\{a(p_i), t_{min}(\pi_i)\}$.
The vector $q^\Pi = (q_1, \dots, q_{n_\Pi})$ describes the volumes that are
collected along the route $\Pi$. For a pickup operation at $p_i$ we have
$q_i > 0$, and for the associated delivery operation at $p_j$ we define
$q_j := -q_i$. The capacity usage along $\Pi$ is determined recursively:
$c_1^\Pi := 0$ and $c_i^\Pi := c_{i-1}^\Pi + q_{i-1}$.

The route $\Pi$ is called a pd-path for vehicle $v$ if the following
restrictions hold [8]. Either both operations of $r_i$ or none of them are
contained in $\Pi$ (pairing), a pickup operation precedes its associated
delivery operation (precedence), the maximal load $C_{max}$ is not exceeded
for all $i$: $c_i^\Pi \le C_{max}$ (capacity), and the leaving time for
operation $\pi_i$ lies in the specified time window:
$t_{min}(p_i) \le e(p_i) \le t_{max}(p_i)$ (time window).
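
The four restrictions can be checked in a single pass over a route. The
following Python sketch is one possible checker and rests on several
assumptions: the dummy start and stop operations are left out, the load is
tested after each operation rather than on arrival, and the class
`Operation`, the function `is_pd_path` and all example values are
hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Operation:
    location: str
    request: int       # index of the customer request
    quantity: float    # q > 0 for a pickup, -q for the matching delivery
    t_min: float       # earliest allowed operation time
    t_max: float       # latest allowed operation time


def is_pd_path(route: List[Operation], travel: Dict[Tuple[str, str], float],
               start: str, capacity: float) -> bool:
    picked, delivered = set(), set()
    load, leave, here = 0.0, 0.0, start
    for op in route:
        arrive = leave + travel[(here, op.location)]  # a(p_i) = e(p_{i-1}) + travel time
        leave = max(arrive, op.t_min)                 # wait until the window opens
        if leave > op.t_max:
            return False                              # time window violated
        if op.quantity > 0:
            picked.add(op.request)
        else:
            if op.request not in picked:
                return False                          # precedence violated
            delivered.add(op.request)
        load += op.quantity
        if load > capacity:
            return False                              # capacity exceeded
        here = op.location
    return picked == delivered                        # pairing


travel = {("D", "P1"): 10, ("P1", "D1"): 20}
route = [Operation("P1", 1, 5, 0, 50), Operation("D1", 1, -5, 25, 40)]
print(is_pd_path(route, travel, start="D", capacity=10))
```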

For the fulfillment of $\Pi$ one monetary unit per distance unit driven is
taken into account, leading to costs $C_\Pi$ for route $\Pi$.

The total received revenues $E_\Pi$ for $\Pi$ are given by the sum of the
revenues $REV_i$ of the requests $r_i$ contained in $\Pi$.

A pd-schedule $S$ is a set of pd-paths $\Pi^1, \dots, \Pi^m$ so that each
customer request is assigned to at most one path.

Optimization Problem. For each vehicle $k$ the set $S_k$ of possible
pd-paths is finite, so that a set partition formulation [10] of the problem
at hand is appropriate. From the set
$\mathcal{S} := \bigcup_{k \in \mathcal{F}} S_k$ of pd-paths we select the
routes that form the schedule $S$. The following binary decision variables
are introduced. Let $X_i = 1 \Leftrightarrow$ route $i$ is in the schedule
$S$ and $Y_{ik} = 1 \Leftrightarrow$ route $i$ is served by vehicle $k$.
For the binary constants $z_{ij}$ it is assumed that
$z_{ij} = 1 \Leftrightarrow$ route $i$ contains customer request $j$. The
Pickup and Delivery Selection Problem (PDSP) is then defined as shown in
(1a)-(1f).

$$\max \sum_{i \in \mathcal{S}} \left( E_i \cdot X_i - C_i \cdot X_i \right) \qquad (1a)$$

$$\sum_{i \in S_k} Y_{ik} = 1 \quad \forall k \in \mathcal{F} \qquad (1b)$$

$$\sum_{i \in \mathcal{S} \setminus S_k} Y_{ik} = 0 \quad \forall k \in \mathcal{F} \qquad (1c)$$

$$\sum_{k \in \mathcal{F}} Y_{ik} = X_i \quad \forall i \in \mathcal{S} \qquad (1d)$$

$$\sum_{i \in \mathcal{S}} z_{ij} \cdot X_i \le 1 \quad \forall j \in \mathcal{R} \qquad (1e)$$

$$Y_{ik}, X_i \in \{0, 1\}. \qquad (1f)$$

Each vehicle $k$ is assigned to exactly one pd-path (1b) so that the maximal
capacity of $k$ is not exceeded (1c), each pd-path is selected at most once
(1d) and no request is contained in more than one selected pd-path (1e).
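
For a handful of candidate pd-paths the selection can be carried out by
enumeration, which makes the role of the constraints visible: one path per
vehicle, and no request in two selected paths. The Python sketch below is
such an enumeration over hypothetical data. The function name
`best_schedule`, the list layout and the profit values (revenue minus
costs of a path) are assumptions, and an empty path with profit 0 stands
for a vehicle that stays idle.

```python
from itertools import product
from typing import FrozenSet, List, Tuple

Path = Tuple[float, FrozenSet[int]]   # (profit E_i - C_i, requests contained)


def best_schedule(paths_per_vehicle: List[List[Path]]) -> Tuple[float, Tuple[Path, ...]]:
    """Pick exactly one candidate path per vehicle so that no request is served
    twice and the total profit is maximal (exhaustive enumeration)."""
    best_profit, best_choice = float("-inf"), ()
    for choice in product(*paths_per_vehicle):
        served = [req for _, requests in choice for req in requests]
        if len(served) != len(set(served)):
            continue                                  # a request appears twice
        profit = sum(p for p, _ in choice)
        if profit > best_profit:
            best_profit, best_choice = profit, choice
    return best_profit, best_choice


candidates = [
    [(0, frozenset()), (4, frozenset({1, 2})), (3, frozenset({1}))],   # vehicle 1
    [(0, frozenset()), (5, frozenset({2, 3})), (2, frozenset({3}))],   # vehicle 2
]
profit, chosen = best_schedule(candidates)
print(profit, chosen)
```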

Benchmark Problems. A revenue is added to the PDPTW-instances introduced in
[7]. Let $\Pi^1, \dots, \Pi^m$ be an optimal or near-optimal solution of
such an instance. Let $r_{k_1}, \dots, r_{k_{n_k}}$ be the requests
assigned to vehicle $k$. Then
$\rho_{k_j} := C_{\Pi^k} / (d(p_{k_1}^+, p_{k_1}^-) + \dots + d(p_{k_{n_k}}^+, p_{k_{n_k}}^-))$
for all $j = 1, \dots, n_k$ is the charge for moving along one distance
unit with vehicle $k$. The function $d(\cdot, \cdot)$ describes the
distance between two locations. The total revenue $REV_i$ is then
determined as $REV_i := \rho_i \cdot c_i \cdot d(p_i^+, p_i^-)$. The
solution of the PDPTW-instance is then evaluated as a PDSP solution by
calculating the overall profit contribution. The achieved value serves as a
reference value.

We derive PDSP problems from 40 different PDPTW-instances and generate
several classes of instances by setting the maximal capacity of the
vehicles to $C_{max} = 20, 40, \dots, 200$. Thus, 40 · 10 = 400 benchmark
instances are available.

3 Hybrid Algorithm 

To solve the PDSP we propose a Genetic Algorithm (GA) that is seeded by 
a parallel route construction heuristic. 

Schedule Construction. The schedule construction heuristic consists of
three phases. First the operations are ordered along the time axis, then
they are distributed among the vehicles, and finally requests that violate
the capacity or the time window restriction are removed from the routes.

Phase 1 (Slot-Assignment): The relevant part of the time axis is partitioned 
into s > 1 equidistant slots. Each operation is assigned to the slot in which its 
latest permissible leaving time falls. It is ensured that each pickup operation 
precedes its corresponding delivery operation. 

Phase 2 (Routing): The sequences in which the requests are taken and the
order in which the vehicles are checked are not externally determined. Let
R be a request sequence and V be a vehicle sequence. Each request from R is
assigned to the first vehicle from V with a free pickup and delivery slot.
A request that cannot be assigned to any vehicle remains unserved. The
depot situated nearest to the last delivery operation of a vehicle is
selected as its terminating point. The obtained routes
$\Pi^1, \dots, \Pi^m$ satisfy the pairing and the precedence condition. The
load of the vehicles and the leaving times of the operations are determined
as described in section 2.

Phase 3 (Obtaining Feasibility): Feasibility concerning capacity and time
windows is obtained greedily. Each scheduled operation is checked along the
time axis. If a capacity or time window constraint violation is detected,
the currently considered request is removed from the route.

Adaptive Improvement. A Genetic Algorithm [1] is configured to improve 
the pd-schedules generated from the heuristic described above. 

Initial Population. The application of the schedule generation heuristic with 
randomly generated request and vehicle permutations produces an initial 
population of feasible pd-schedules. 

Schedule Encoding. A schedule is represented by a string $(p, d, v)$ of
length $2n + m + m$. $p$ carries a permutation of all customer locations.
The $i$-th component of $d$ carries the number of the terminating depot
assigned to vehicle $i$. The $i$-th component of $v$ contains the number of
requests assigned to vehicle $i$.

Schedule Decoding. The first $v_1$ pickup operations in $p$ and the
corresponding $v_1$ delivery operations form the route of the first
vehicle. For the second vehicle the next $v_2$ pickup operations and their
corresponding delivery operations are selected, and so on. Operations of
rejected requests are carried in the last permutation positions. Conditions
(1b)-(1e) are thus satisfied by construction. With this representation,
each individual represents a pd-schedule provided that the routes obtained
from $p$ are pd-paths. Therefore, each route is greedily checked for
capacity and time window constraint violations. Requests that cause
constraint violations are removed from the routes. The genetic information
is immediately updated and fed back into the population.
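
The decoding step can be sketched in a few lines of Python. The
representation below is an assumption made for illustration: each entry of
`p` is a pair of an operation kind ("+" for pickup, "-" for delivery) and a
request number, and the function name `decode` is hypothetical. The greedy
feasibility check of the routes is not part of this sketch.

```python
from typing import List, Tuple

Op = Tuple[str, int]   # ("+", request) for a pickup, ("-", request) for a delivery


def decode(p: List[Op], d: List[int], v: List[int]) -> List[Tuple[List[Op], int]]:
    """Split the permutation p into one route per vehicle.
    Vehicle k receives the next v[k] requests in pickup order and ends at depot d[k]."""
    pickup_order = [req for kind, req in p if kind == "+"]
    owner, pos = {}, 0
    for vehicle, count in enumerate(v):
        for req in pickup_order[pos:pos + count]:
            owner[req] = vehicle
        pos += count
    routes: List[List[Op]] = [[] for _ in v]
    for kind, req in p:
        if req in owner:                   # operations of rejected requests are skipped
            routes[owner[req]].append((kind, req))
    return [(route, d[k]) for k, route in enumerate(routes)]


p = [("+", 1), ("+", 2), ("-", 1), ("+", 3), ("-", 2), ("-", 3), ("+", 4), ("-", 4)]
schedule = decode(p, d=[0, 1], v=[2, 1])
print(schedule)
```

In this example request 4 is not covered by `v` and is therefore rejected,
in line with the convention that rejected requests occupy the last
positions of the permutation.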

Cross-Over. Applying the cross-over operator PPX [2], a new permutation of
operations is generated from two parental permutations. This leads to a new
collection of routes, each satisfying the pairing and the precedence
restriction. From the same parents a new depot sequence is obtained by
applying a standard uniform cross-over operator to the parental depot
sequences. To derive a new request assignment sequence we apply a modified
uniform cross-over operator to the last segment of both parental
individuals. The modification ensures that at most $n$ requests are
distributed among the vehicles. Cross-over of the third segment changes the
number of accepted customer requests.

Mutation. Two arbitrarily chosen alleles are swapped within each segment. 
Pairing and precedence feasibility are preserved. Additionally, the number 
of requests within the routes is increased. Therefore, the so far unselected 
requests are randomly distributed among the vehicles. 

Selection. A $(\mu + \lambda)$ scheme [1] is used to derive a new
population of individuals. The best individuals from the union of the
offspring set and the original population form the new population.
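
A plus-selection of this kind takes only a few lines. The Python sketch
below is a generic illustration; the function name `plus_selection` and the
toy fitness values are assumptions, and the population size is passed as
`mu`.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def plus_selection(population: List[T], offspring: List[T],
                   fitness: Callable[[T], float], mu: int) -> List[T]:
    """Keep the mu fittest individuals from parents and offspring together."""
    return sorted(population + offspring, key=fitness, reverse=True)[:mu]


# Toy example: the individuals are simply their own fitness values.
survivors = plus_selection([3, 1], [5, 2, 4], fitness=lambda x: x, mu=2)
print(survivors)
```

Because the parents compete with their offspring, the best individual found
so far is never lost from one generation to the next.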

Evaluation. Since all individuals are feasible with respect to (1b)-(1f), a
suitable fitness value is the objective value obtained by (1a).

4 Computational Investigations 

Two experiments have been performed to compare the capability of the adap- 
tive approach with a non-adaptive biased random search (BRS) heuristic. 

In the BRS experiment, a request permutation in which the most promising
requests tend to occupy the first components is instantiated as
follows. For each request $r_i$ the standardized revenue $p_i$ (cf. the benchmark
paragraph) is calculated. Then the components are determined successively
by roulette-wheel selection, which prefers requests with large $p_i$. A selected
request is not considered in subsequent selection steps. Altogether 2000
independent permutations have been generated. Since all vehicles offer the same
capacity, a random vehicle permutation is used.
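
The construction of one biased permutation can be sketched as follows. This is only an illustration under stated assumptions: the standardized revenues are taken to be strictly positive, and the function name `biased_permutation` and the request identifiers are hypothetical.

```python
import random


def biased_permutation(revenues, rng=None):
    """Build a request permutation by roulette-wheel selection without replacement.

    revenues: dict mapping a request id to its (positive) standardized revenue.
    Requests with a large revenue are more likely to appear early.
    """
    rng = rng or random.Random()
    remaining = dict(revenues)
    order = []
    while remaining:
        ids = list(remaining)
        weights = [remaining[r] for r in ids]
        # One spin of the wheel: selection probability proportional to revenue.
        chosen = rng.choices(ids, weights=weights, k=1)[0]
        order.append(chosen)
        # A selected request is not considered in later selection steps.
        del remaining[chosen]
    return order


perm = biased_permutation({"r1": 0.9, "r2": 0.5, "r3": 0.1}, random.Random(42))
```

Repeating the call with independent random states yields the independent permutations used in the experiment.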

The results of BRS are compared to those that are obtained from our hybrid 
genetic algorithm (HGA) seeded with a set of pd-schedules obtained from the 
BRS-experiments. The GA is configured for 100 individuals and generates 200 
generations within each run. For each PDSP-instance ten independent runs 
have been executed. Best results are obtained for a cross-over probability of
0.6 and a mutation probability of 0.3, i.e., on average three out of ten offspring
are mutated.



Table 1. Results obtained for different capacities

capacity         20    40    60    80   100   120   140   160   180   200
BRS  best obj  0.25  0.40  0.46  0.49  0.52  0.53  0.54  0.54  0.55  0.55
     best acc  0.34  0.44  0.49  0.50  0.52  0.54  0.54  0.54  0.54  0.54
HGA  avg. obj  0.32  0.52  0.56  0.60  0.62  0.64  0.65  0.65  0.65  0.65
     avg. acc  0.38  0.47  0.52  0.53  0.55  0.56  0.56  0.57  0.57  0.56


Tab. 1 shows the obtained results consolidated for different capacities. The
values are compared to those derived from the solutions of the PDPTW
instances from [7]. For each capacity the average of the best solutions found is
shown together with the corresponding ratio of accepted requests. For
increasing vehicle capacities the objective value doubles up to a capacity of
100. For larger capacities no significant improvements are observed.

The results from BRS are outperformed by those obtained from HGA. The
objective value doubles up to a capacity of 120. Further capacity
enlargements do not lead to significantly higher values. The number of selected
requests is slightly larger in the HGA results. In both experiments the
share of selected requests does not improve on average if the capacity is
enlarged to more than 120 units.

5 Conclusion and Future Work 

In this contribution a formal model for the selection of the requests with 
maximal positive contributions to the overall profit has been proposed. The 
obtained results from a permutation controlled parallel construction heuristic 
can be improved by a Genetic Algorithm. 

Future research will be dedicated to the identification of promising request 
permutations to generate pd-schedules of improved quality. 
Acknowledgement: This work was supported by the BMBF Project SpiW, 
No. 01HT0144. 

References 

1. Bäck, T., Fogel, D.B., Michalewicz, Z. (2000) Evolutionary Computation 1 -
Basic Algorithms and Operators, Institute of Physics Publishing 

2. Bierwirth, C., Mattfeld, D.C., Kopfer, H. (1996) On Permutation Representa- 
tions for Scheduling Problems, Proc. of PPSN IV, Springer, 310-318 

3. Butt, S.E., Ryan, D.M. (1999) An optimal solution procedure for the multiple
tour maximum collection problem using column generation. Computers &
Operations Research 26, 427-441

4. Dumas, Y., Desrosiers, J., Soumis, F. (1991) The pickup and delivery problem
with time windows, European Journal of Operational Research 54, 7-22 

5. Feillet, D. (2001) Traveling Salesman problems with profits: an overview. Pro- 
ceedings of OPR^, Paris 

6. Laporte, G., Martello, S. (1990) The Selective Travelling Salesman Problem, 
Discrete Applied Mathematics 26, 193-207 

7. Nanry, W.P., Barnes, J.W. (2000) Solving the pickup and delivery problem with 
time windows using reactive tabu search. Transportation Research Part B 34, 
107-121 

8. Savelsbergh, M.W.P., Sol, M. (1995) The General Pickup and Delivery Problem, 
Transportation Science 29, 17-29 

9. Verweij, B., Aardal, K. (2000) The Merchant Subtour Problem, technical report
UU-CS-2000-25, Universiteit Utrecht 

10. Williams, H.P. (1999) Model Building in Mathematical Programming, 4th
Edition, Wiley, Chichester





VRP with Interdependent Time Windows - 
A Case Study for the Austrian Red Cross 
Blood Program 



Karl Doerner, Manfred Gronalt, Richard F. Hartl, Marc Reimann, and
Kerstin Zisser

University of Vienna, Department of Management Science, Bruenner Strasse 72,
A-1210 Vienna

University of Agricultural Sciences Vienna, Department of Forestry and Timber
Industry, Gregor Mendel Strasse 33, A-1190 Vienna



Abstract. This work shows our endeavors to assist the fleet management division 
of the Austrian Red Cross blood programme by developing an algorithm for the 
VRP with interdependent time windows. Usually, blood donation campaigns are 
operating all day long and the collected blood must be transported from campaign 
locations to a central blood processing facility, where it has to be further processed 
within a certain amount of time. Because of this constraint, each location must
be visited several times during the course of a day. Vehicles are
scheduled to pick up the stored blood donations and deliver them to the
processing facility within the given time window. Time windows for each location show
a dynamic behaviour as they depend on previous pickup times. In order to solve the
problem we develop a savings-based constructive procedure for the VRP with inter- 
dependent time window constraints. First, our algorithm calculates the minimum 
number of pickups for each location and then it determines possible combinations 
when constructing a tour. Preliminary results on real life data show that we can 
easily decrease overall tour lengths. 



1 Introduction 

“Everybody knows giving blood saves lives” [1]. But before blood reaches
vulnerable patients suffering from cancer or injuries from an accident, it passes
through a highly complex process of screening, processing, testing, transporting
and monitoring. The blood collection process of the Austrian Red Cross blood
program is divided up regionally, and each region plans and schedules its
blood donation campaigns independently of the others.

For collecting the donations, a number of districts and locations, usually
spread over the area of Eastern Austria, is selected. A team is assigned to
each of the locations for a specific daily duration to organize the blood donation.
The team collects the donations and provides for their proper storage. A blood
donation campaign usually takes place at the weekend.

Collected blood is transported from the campaign locations to a central 
blood processing facility, where it has to be further processed and separated
into its component parts (Red Blood Cells, Platelets, and Plasma) within
about 300 minutes. 

Currently, the blood transport is scheduled manually and the resulting
tours are registered on a sequential list. As a result, possible combinations
of tours are only found if they can obviously be put together, e.g. if locations
are situated along the same highway. In our work we both integrate a GIS,
which is already in use for locating blood donors in case of emergencies,
and develop a procedure for finding a minimum travel cost solution for the
present VRP.

The problem characteristics can be stated as follows:

• Any blood donated must arrive at the processing facility at most 300
minutes after donation. Campaign durations are typically longer than
300 minutes. Thus, each campaign needs to be visited more than once.
For each visit a time window applies, which is reset every time a vehicle
leaves a particular location because of time window interdependence.
The last donations are taken back to the processing facility by the people
organizing the campaign.

• The goal is to find minimum cost tours and to allocate the appropriate
transport devices to take back all the blood.

• For calculated solutions an eligibility check with the people at the control
station is done. Further, rendezvous options are considered in order to
improve the solution found.

We develop a savings-based constructive solution procedure for the VRP with 
interdependent time windows. In the initialization phase the minimum num- 
ber of pickups and the current slack time are determined for each location. 
This leads to a multiplication of locations and their respective time windows 
in the used construction graph. In the construction phase locations are added 
to a tour according to the savings values and the current slack times. 

The problem considered in this paper is related to the periodic vehicle 
routing problem with time windows. Cordeau et al. ([5]) apply the unified 
tabu search heuristic to solve this problem, while Baptista et al. ([6]) apply
a heuristic approach to a real world period vehicle routing problem for the 
collection of recycling paper containers. 

For the problem class of the period travelling salesperson problem, where 
over a period of time each customer must be visited at least once with some 
customers requiring several visits, Chao et al. ([2]) and Paletta ([3]) developed 
heuristic approaches, while Drummond et al. ([4]) solve this problem class by 
using a parallel genetic algorithm and local search heuristics. These problems 
do not have time window and maximum tour lengths constraints. 

Yi and Scheller-Wolf ([7]) design an exact solution procedure based on 
integer programming for a vehicle routing problem with time windows faced 
by the American Red Cross. In their approach the objective is to bring a 
predetermined amount of blood from the campaign locations back to the
processing facility at minimum cost, where the amount collected at each cam-
paign depends on the time of the visit. They consider only a single visit per 
campaign location thus eliminating interdependence between time windows. 
Still, a maximum tour length exists due to perishability of blood. 

2 Model formulation 

We use the following data and indices for our model: 

$i, j$ ... blood donation locations (BDL)
$0$ ... blood bank (BB)
$d_{i0}$ ... distance from BDL $i$ to BB
$d_{ij}$ ... distance from BDL $i$ to BDL $j$
$a$ ... one BDL or an already assigned feasible combination of BDLs
$s_0$ ... service time at BB
$s_i$ ... service time at BDL $i$
$a_i, b_i$ ... starting/finishing time of the blood donation event at BDL $i$
$T$ ... max. time span between blood donation and processing at BB

Based on these data we can compute the following values: 

$t'_i = T - s_i - d_{i0} - s_0$ ... maximum time period between two collections of
blood donations at BDL $i$

$P_i = b_i - t'_i - a_i$ ... relevant time span to collect the donations at
BDL $i$, taking into account that the donated blood
at the end of the blood donation event is returned
by the organizing team of the event, which also
returns to the blood bank

$m_i = \lceil P_i / t'_i \rceil$ ... number of required blood pickups at BDL $i$

$\rho_i = m_i \cdot t'_i - P_i$ ... slack time at BDL $i$; results from the required
integer number of blood pickups

$t^{\max}_{i1} = a_i + t'_i$ ... latest arrival time for the first blood pickup

$t^{\max}_{ik}$ = (time of the $(k-1)$-th pickup at BDL $i$) $+ t'_i$ ... latest arrival time for the k-th blood pickup

$t^{\min}_{ik}$ ... earliest arrival time for the k-th visit at BDL $i$
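
The derived quantities can be illustrated with a small worked sketch. It rests on an assumed reading of the definitions above: the maximum collection interval is T minus both service times and the travel time to the blood bank, the relevant time span is the campaign duration shortened by one such interval, the number of pickups is that span divided by the interval and rounded up (clamped at zero, which is an added assumption), and the slack is the rounding surplus. The function name and all numbers are hypothetical.

```python
import math


def pickup_parameters(a_i, b_i, s_i, d_i0, s_0, T):
    """Return (interval, span, pickups, slack) for one donation location.

    All arguments are times in minutes. The ceiling formula for the number
    of pickups is an assumption, not a formula confirmed by the paper.
    """
    interval = T - s_i - d_i0 - s_0      # max. time between two collections
    if interval <= 0:
        raise ValueError("location cannot be served within the time limit T")
    span = b_i - interval - a_i          # last donations go back with the team
    pickups = max(0, math.ceil(span / interval))
    slack = pickups * interval - span    # surplus caused by rounding up
    return interval, span, pickups, slack


# Hypothetical campaign from 8:00 (480) to 18:00 (1080), T = 300 minutes.
params = pickup_parameters(a_i=480, b_i=1080, s_i=10, d_i0=60, s_0=10, T=300)
latest_first_pickup = 480 + params[0]
```

In this example the vehicle must arrive for the first pickup no later than minute 700, and two pickups leave 60 minutes of slack that can be spent on detours.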

To ensure feasibility three constraints are particularly important: 

1. Constraint (Detour Constraint): 

$\rho_a \ge -d_{a0} + d_{am} + d_{m0}$ (1)

Adding a pickup at BDL m to partial tour a is possible only, if the slack 
time of the partial tour is larger than the time of the detour associated with 
adding BDL m. 

2. Constraint (Reachability Constraint): 

$t^{\min}_{a} + s_a + d_{am} \le t^{\max}_{mn}$ (2)

Adding the n-th pickup at BDL m to a partial tour a is possible only, if the 
campaign m can be reached on time. 

3. Constraint (Punctuality Constraint): 

$t^{\max}_{a} + s_a + d_{am} \ge t^{\min}_{mn}$ (3)

Adding the n-th pickup at campaign m to a partial tour a is possible only, if 
all the blood collected on the partial tour so far still arrives at the processing 
facility on time. 

Note that the detour constraint (constraint 1) and the punctuality
constraint (constraint 3) are both necessary: the detour constraint evaluates
the detour, whereas the punctuality constraint ensures that donated blood
already picked up does not perish as a result of the detour.
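
A feasibility test combining the three constraints might look like the following sketch. The detour check follows the verbal description directly. The exact forms of the reachability and punctuality checks are assumed readings (the time window of the partial tour, shifted by service and travel time, must overlap the time window of the new pickup), and all parameter names are hypothetical.

```python
def can_add_pickup(slack_a, d_a0, d_am, d_m0,
                   t_min_a, t_max_a, s_a,
                   t_min_mn, t_max_mn):
    """Check whether the n-th pickup at BDL m may be added to partial tour a.

    All arguments are times in minutes. Checks (2) and (3) are assumed
    readings of the constraints, not confirmed formulas.
    """
    detour = d_am + d_m0 - d_a0                # extra travel caused by visiting m
    detour_ok = slack_a >= detour              # (1) slack covers the detour
    earliest_arrival = t_min_a + s_a + d_am
    latest_arrival = t_max_a + s_a + d_am
    reachable = earliest_arrival <= t_max_mn   # (2) m is reached on time
    punctual = latest_arrival >= t_min_mn      # (3) blood on board stays in time
    return detour_ok and reachable and punctual


feasible = can_add_pickup(60, 50, 20, 55, 600, 700, 10, 650, 720)
infeasible = can_add_pickup(10, 50, 20, 55, 600, 700, 10, 650, 720)
```

In the second call the detour of 25 minutes exceeds the slack of 10 minutes, so the combination is rejected even though both time windows are compatible.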

3 Heuristic solution procedure 

Let us now turn to the approach we propose to solve the problem with inter- 
dependent time windows. We will also discuss implementation issues. 

• Step I: Read input data and initialize system 

Initially the blood donation campaign information and the distance
information are read. The distance information is computed by the
GIS in use. For each BDL all the relevant information (e.g. the relevant time
span to collect the donations, the number of required blood pickups, the
slack time, ...) is computed.

• Step II: Compute Savings matrix 

By using the distance information the Savings values between all BDLs 
are computed. The Savings algorithm, proposed in [8] is the basis of most 
commercial software tools for solving VRPs in industrial applications. It is 
initialized with the assignment of each BDL to a separate tour. After that 
for each pair of BDLs i and j the following savings measure is calculated: 



$s_{ij} = d_{i0} + d_{0j} - d_{ij}$ (4)

where $d_{ij}$ denotes the distance between BDLs $i$ and $j$ and the index 0
denotes the BB. Thus, the values $s_{ij}$ contain the savings of combining
two pickups at BDL $i$ and BDL $j$ on one tour as opposed to serving them
on two different tours.

• Step III: Determine feasible combinations 
Having computed the Savings values for each BDL, we check the
constraints (detour constraint, reachability constraint, punctuality constraint).
Thus, we can check whether there are any feasible combinations between two
BDLs. Note that at each BDL a number of visits may be necessary,
such that combinations between all of these visits have to be checked. A
combination is only feasible if all the constraints are satisfied.




• Step IV: Find combinations 

In the iterative phase of step IV pickups at BDLs are combined. We
start to combine a feasible combination of two BDLs according to the 
earliest possible pickup time. Feasible combinations with a Savings value 
below a certain threshold are not considered. After the realisation of the 
combination of BDL i and BDL j, the time windows for the next pickups 
of these two BDLs and also the list of the remaining possible combinations 
are updated. This step is repeated until no combination with a savings 
value greater than a given threshold exists. 

• Step V: Report results 

The combinations of BDLs, the number of required pickups and the
tour length are reported. The tours are visualized by our commercial
desktop GIS.
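
Steps II and IV can be illustrated by a short sketch that computes the savings values from a symmetric distance matrix and discards pairs below the threshold. It deliberately omits the time-window checks and the ordering by earliest possible pickup time. Sorting by decreasing savings is the classical Clarke and Wright order and is used here only for illustration. The function name and the distance matrix are hypothetical.

```python
from itertools import combinations


def savings_pairs(dist, depot=0, threshold=0.0):
    """Return (saving, i, j) for all location pairs whose saving reaches the threshold.

    dist: square symmetric matrix (list of lists) of travel times in minutes,
    where index `depot` is the blood bank.
    """
    locations = [i for i in range(len(dist)) if i != depot]
    pairs = []
    for i, j in combinations(locations, 2):
        # Saving of serving i and j on one tour instead of two separate tours.
        saving = dist[i][depot] + dist[depot][j] - dist[i][j]
        if saving >= threshold:
            pairs.append((saving, i, j))
    pairs.sort(reverse=True)
    return pairs


# Hypothetical travel times between the blood bank (0) and three locations.
dist = [
    [0, 30, 40, 50],
    [30, 0, 20, 70],
    [40, 20, 0, 35],
    [50, 70, 35, 0],
]
candidates = savings_pairs(dist, depot=0, threshold=30)
```

Here the pair (1, 3) saves only 10 minutes and is dropped, which mirrors how the threshold keeps weak combinations from blocking better ones.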

4 Numerical example 

In the following section, we present a numerical study that applies our ap- 
proach described previously by using real world data from the Austrian Red 
Cross. Our example considers a typical day with 13 BDLs in the eastern part 
of Austria. The required number of pickups at one BDL varies between zero 
and three. The daily durations of the BDLs are given.




[Figure omitted; axis label: threshold modifications]

Fig. 1. Numerical Example



In our algorithm only combinations of two BDLs are considered. Figure 1
shows the solution quality obtained by using different thresholds ranging from
10 (column 2) to 180 minutes (column 10). The tour length obtained by
assigning each pickup at a BDL to a separate tour is 2606 minutes (column
1). The minimum tour length in this example is achieved using a threshold
between 30 and 70 (column 4) and equals 1795.23 minutes.

If the threshold is chosen too small, the solution
quality also decreases. The reason is that realising combinations
with an early pickup time and a small Savings value prevents the realisation
of other promising combinations.





On the other hand, by choosing a threshold that is too large, we reject 
too many promising combinations and thus solution quality decreases. 

Note that we can also decrease the costs in comparison with the actual costs,
but due to contractual agreements we are not allowed to report the solution
quality of the Austrian Red Cross.

5 Conclusions 

In this paper we have proposed a model and a heuristic solution approach for a 
real world scheduling problem faced by the Austrian Red Cross. The approach 
can be used to instantly generate a candidate solution. By running the algorithm a few
times with different thresholds, the user is provided with different solutions 
to choose from. 

Starting from our initial approach, we are now examining strategies based 
on relaxing the assumption to visit each campaign the minimum number of 
times. Further, we plan to integrate a rendezvous option where two vehicles 
can exchange loads in order to save unnecessary travelling. 

Acknowledgments 

Financial support by grant #8630 from the Oesterreichische Nationalbank 
(OeNB) is gratefully acknowledged. 

References 

1. www.redcross.org/services/biomed/blood/learn/process.html 

2. Chao, I.-M., Golden, B. L. and Wasil, E. A., A new heuristic for the period
traveling salesman problem, Computers & Operations Research 22 (1995), pp.
553-565.

3. Paletta, G., The period traveling salesman problem: a new heuristic algorithm,
Computers & Operations Research 29 (2002), pp. 1343-1352.

4. Drummond, L. M. A., Ochi, L. S., Vianna, D. S., An asynchronous parallel meta- 
heuristic for the period vehicle routing problem. Future Generation Computer 
Systems 17 (2001), pp. 379-386. 

5. Cordeau J.-F.; Laporte G.; Mercier A., A unified tabu search heuristic for ve- 
hicle routing problems with time windows. Journal of the Operational Research 
Society 52 (2001), pp. 928 - 936. 

6. Baptista, S., Oliveira, R. C. and Zuquete E., A period vehicle routing case study, 
European Journal of Operational Research 139 (2002), pp. 220-229. 

7. Yi, J. and Scheller-Wolf, A., A Vehicle Routing Problem with Time Window 
Constraints and Variable Rewards: An Application in the American Red Cross. 
Research Report. 

8. Clarke, G. and Wright, J. W., Scheduling of vehicles from a central depot to a 
number of delivery points. Operations Research 12 (1964), pp. 568-581. 





Incident Management Based on Real Time 
Simulation 



Jürgen Zajicek, Martin Linauer, Katja Schechtner
arsenal research

Austrian Research and Testcentrum Arsenal Ges.m.b.H.
Business Area Transport Technologies
A-1030 Vienna, Faradaygasse 3
phone: +43-(0)50550-6610, fax: +43-(0)50550-6613
e-mail: juergen.zajicek@arsenal.ac.at



Abstract 

The main function of a traffic simulation used to evaluate the current
traffic situation is the spatio-temporal interpolation of the situation
between the stationary detector points of a road network.

Today, traffic information systems are mainly based on traffic data from
fixed metering points using technologies such as induction loops, radar or
infrared sensors. Because this information is static, current traffic information
systems offer no or only very limited spatio-temporal portability.

The following figure will show how simulation helps to fill the gaps
without traffic information, using the Floating Car Data (FCD) method to obtain
area-wide traffic information.

To be able to employ a dynamic traffic metering method, simulation
algorithms have to be redesigned in the near future. The Floating Car Data method
will for the first time enable the creation of area-wide traffic information without
expensive infrastructure and allow a rapid reaction to current incidents on the
road network under consideration.

However, this method presupposes simulation algorithms that use variable time
steps instead of the constant time steps of the prevalent software offered today.
Only online simulation software (macro simulation software) can evaluate traffic
data sent by moving vehicles at different times and positions.

This ability of simulation software should not be reserved to the macro layer,
because incidents have a great influence on the traffic situation that can only
be registered in the micro layer.

To meet this requirement, interfaces between both layers must be realised
to enable a bi-directional data exchange. However, a lot of research work is
required to solve these problems.

This lecture should help to understand the existing problems and show the
possibilities opened up by using this new simulation method in combination with
Floating Car Data in incident management.




Introduction 

Considering the daily situation on the roads in and around congested city
areas, you will recognise that traffic jams are getting worse every day.
To avert the impending collapse of traffic systems, strategies to manage the
emerging problems have to be developed. But before starting a management
system you first have to know what the current situation on the road network looks
like.

Today, traffic information systems are mainly based on traffic data from
fixed metering points using technologies such as induction loops, radar or infrared
sensors. In Austria there are only 156 official metering points placed on the main
routes of the road network, compared with about 8000 stations in Germany.

Due to this static information current traffic information systems feature no or 
only very limited spatio-temporal portability. A possible way out of this dilemma 
may be the use of a traffic simulation to fill “unknown” areas with information 
about the situation on the relevant roads [3]. 

However, one drawback of this idea deserves attention: with increasing
distance from the metering station, the accuracy of the simulation of the traffic
situation in the area between two metering stations decreases to an
unacceptable level. A remedy for this problem is to use data from “Floating Car
Data” vehicles as a supplementary feed for the simulation.



Principle of Floating Car Data (FCD) Acquisition 

In floating car data applications (FCD services), the vehicle speed, in conjunc- 
tion with the vehicle position and time, is recorded and transmitted to a service 
center in defined time intervals in order to generate traffic information (see figure 
1). The transmission of the FCD data to the service center can be event
controlled or take place at a fixed time interval (e.g. 1 minute). Combining floating car
data from different vehicles, the service center draws conclusions depending on 
the kind of FCD service, and sends these conclusions back to the service subscrib- 
ers. Typical services are traffic information, like detection of traffic jams, informa- 
tion on traffic flow (level of service) and the information on the travel time. FCD 
services are expected to become a mass market in the future, resulting in a very 
high number of cars taking part. 





[Figure omitted; labels: car enters traffic jam; vehicle speed in km/h; current speed v = 125 km/h; traffic flow related parameters, events]

Figure 1: Principle of Floating Car Data Acquisition [1]



Floating car data applications are not only based on satellite navigation such as
GPS (EGNOS, GALILEO). It is also possible to derive position and time from a
GSM network by tracking the Cell ID and thus to generate Floating Car Data. In
Austria, mobile phone penetration is at 84%. This means that in most cars the driver
will have a cell phone. This way of localisation does not have the same quality as
GPS, but arsenal research is developing FCD models that are designed for
using the mobile Cell ID [2].



Integrate FCD in Real Time Simulation Models 

When this new way of measuring traffic situations is implemented, the next problem
is that the data do not arrive at the traffic center in fixed time steps and
also come from different places. To use both kinds of data sources a special
data management system must be established. The conventional data will be
handled as in the past. The floating car data has to be assigned to road sections
in the vicinity of the coordinates of the current data point (Figure 2).
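
The assignment of a data point to a road section can be sketched as a nearest-segment search. The sketch assumes that positions are already projected to planar coordinates in metres and that each road section is a straight segment. Both are simplifications, and the names `assign_to_section`, `sections` and `max_distance` are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def assign_to_section(point, sections, max_distance):
    """Return the id of the nearest section within max_distance, else None."""
    best_id, best_dist = None, max_distance
    for section_id, (start, end) in sections.items():
        d = point_segment_distance(point, start, end)
        if d <= best_dist:
            best_id, best_dist = section_id, d
    return best_id


sections = {"A": ((0.0, 0.0), (100.0, 0.0)), "B": ((0.0, 50.0), (100.0, 50.0))}
matched = assign_to_section((40.0, 10.0), sections, max_distance=30.0)
unmatched = assign_to_section((40.0, 200.0), sections, max_distance=30.0)
```

The distance limit matters for Cell ID positions in particular, since a coarse fix far from any road should rather be discarded than attached to the wrong section.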







[Figure omitted; legend: speed [km/h]]

Figure 2. Positions of FCD-vehicles and their current speed

The position of each observed FCD-vehicle is updated in fixed time steps.
From the distance and the time gap between two consecutive positions the speed of
the vehicle can be calculated. Figure 3 shows the velocities of FCD vehicles
assigned to road sections in the urban area of Vienna.
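
The speed calculation from two consecutive position fixes can be sketched as follows. The sketch assumes that each fix is a tuple of a timestamp in seconds, a latitude and a longitude in degrees, and it uses the great-circle (haversine) distance on a spherical Earth. This is a simplification, since it ignores the actual road geometry between the two fixes.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def speed_kmh(fix_a, fix_b):
    """Average speed in km/h between two fixes (timestamp_s, lat, lon)."""
    dt = fix_b[0] - fix_a[0]
    if dt <= 0:
        raise ValueError("fixes must be in strictly increasing time order")
    distance_m = haversine_m(fix_a[1], fix_a[2], fix_b[1], fix_b[2])
    return distance_m / dt * 3.6


# Hypothetical fixes: one degree of latitude travelled in one hour.
v = speed_kmh((0, 48.0, 16.0), (3600, 49.0, 16.0))
```

With variable time gaps between fixes, as in event-controlled transmission, the same function applies unchanged because the time difference is taken from the timestamps.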



In view of FCD, some critics may ask “Why do we need traffic simulation?”.
Today there are only a few cars equipped with FCD devices. For this
reason, a fast and reliable traffic simulation is needed to manage different traffic
situations. Based on the combined data of the different sources, a meaningful
traffic simulation has to be able to give prognoses about possible situations in the near
future.

Metering stations can only detect an incident on the road network when the
congestion reaches back to their location. Because of the large distances between
them, the metering stations in the area of Vienna are not able to detect incidents
in time to react to the changed situation. This will be
possible if traffic simulation algorithms can handle vehicle volumes (veh/h or
veh/15min, etc.) and speed values assigned to different road sections at the same
time.







[Figure omitted; legend: vehicle speed relative to the permitted speed [%]]

Figure 3: Dedicated velocities of FCD-vehicles in the urban area of Vienna



Only macroscopic simulation algorithms using variable time steps will be able to
meet these requirements, because they can evaluate traffic data sent by
moving vehicles at different times and positions. This ability of simulation
software should not be reserved to the macroscopic layer, because incidents have a
massive influence on the traffic situation that can only be registered in
the micro layer.

To fulfill this requirement, interfaces between both layers must be realised to
enable a bi-directional data exchange. Well-known companies offering traffic
simulation software plan to reorganise their algorithms from fixed to variable time steps.
However, a lot of research work and manpower is required to solve these
problems. For this reason, products are expected to become available in two
to three years.

A fast and reliable evaluation of the current traffic situation is the
prerequisite for an efficient traffic information system that re-routes vehicles on
congested roads. This would mean that vehicles which would otherwise be stationary
or driving slowly through traffic jams can be better distributed onto the
lower-level road network.

As a result, the environment will be less heavily burdened in future than it is
today.




CONCLUSION 

The potential to improve the quality of Real Time Simulation and Incident 
Management by combining data from fixed metering points (inductive loops, ra- 
dar, laser or infrared sensors) and Floating Car Data is enormous. 

On the one hand, conventional metering points are very important for long and
short term traffic prognoses, which need vehicle volume data. On the other
hand, Floating Car Data is the best technology for locating incidents and traffic
jams in the road network. The ability to accurately predict journey times is also
an integral part of FCD applications.



REFERENCES 

[1] Huber, W.: XFCD and Local Danger Warning. Paper No. 2032, 6th Intelligent Transport
Systems World Congress. Toronto, Canada 1999

[2] Linauer, M.: Verkehrssteuerung per Handy - Kombination von Telekommunikation
und Informatik eröffnet neue Wege der Verkehrssteuerung, Schriftenreihe
Forschungsnews 3/2001, arsenal research

[3] Zajicek, J., ONLINE-Simulation - Eine Chance für den innerstädtischen Verkehr, IT'S
T.I.M.E., Heft 1, ARC Austrian Research Centers GmbH, Wien 2001

[4] Daduna, J. R., Voß, St., Informationsmanagement im Verkehr, Physica Verlag,
Heidelberg 2000

[5] OSSA Consortium, Open Framework for Simulation of Transport Strategies and As- 
sessment - User and Policy Requirement, GROWTH PROJECT, 2000 




Online-Dispatching of Automobile Service 
Units 



Martin Grötschel, Sven O. Krumke, Jörg Rambau, and Luis M. Torres

Konrad-Zuse-Zentrum für Informationstechnik, Berlin (ZIB)

Takustraße 7, 14195 Berlin-Dahlem, Germany
e-mail: {groetschel,krumke,rambau,torres}@zib.de



Abstract. We present an online algorithm for a real-world vehicle dispatching
problem at ADAC, the German Automobile Association.



1 Problem Description 

The German Automobile Association ADAC (Allgemeiner Deutscher
Automobil-Club), the second largest such organization in the world, surpassed only
by the American Automobile Association, maintains a heterogeneous fleet of
over 1,600 service vehicles in order to help people whose cars break down on
the road. All ADAC service vehicles (called units in the following, for short)
are equipped with a GPS system which makes it possible to locate them at any
time. In five help centers (Pannenhilfezentralen) distributed throughout
Germany, human dispatchers have to reply to incoming help requests (events)
instantly. Their task is to assign a unit to serve each customer and to
predict the estimated time of arrival at the customer’s location. In addition to
the ADAC fleet, about 5,000 units operated by service contractors can be
employed to cover events that otherwise could not be served in time.

There is no unique objective. The goals are high-quality service (e.g., short 
waiting times for the customers) and low operational costs (e.g., short tour 
lengths and small overtime costs). With increasing costs and request volume, 
ADAC’s dispatching system has come under stress. The task is to design an
automatic system that guarantees small waiting times for events and keeps 
operational costs low. Such a system must address two different issues: 

Realtime-Problem. Given a “snapshot” of the situation at some moment
in time, compute an optimal dispatch for attending all pending events with 
the available units and (if required) contractors. Since it contains the classic 
vehicle-routing problem as a special case, this problem is NP-hard. The diffi- 
culty is that such a dispatch has to be returned in a very short time, usually 
no more than 15 seconds for a system load of about 200 events and 100 units. 



Online-Problem. Once an algorithm for the first task has been found, design
a good strategy for embedding it within the constantly running dynamic
planning process, i.e., decide how often a new dispatch should be computed
in response to incoming events, and/or to what extent changes should be admitted
in a previously computed dispatch. The main challenge lies in the impossibility
of predicting if, where and when events will take place in the near future.

The modeling of the first task, which we shall call the vehicle dispatching
problem Vdp in the following, involves many technical and organizational
side constraints, some hard, others soft, and it takes some time to figure
out which restrictions and objectives really count. As one example, we have
considered constraints arising from a management decision: ADAC’s imposi- 
tion of a soft deadline on the service time of an event, which may be missed 
at the cost of a linearly increasing lateness penalty (soft time windows). 

Using an approach based on column generation, it was possible to design 
an optimization algorithm for the Vdp capable of finding optimal or near- 
optimal solutions for real-world instances and which was compliant with the 
real-time requirements of the problem. In section 3, we give a brief descrip- 
tion of the algorithm (see [5] for the details). The main purpose of this paper 
is to present our first results concerning the online-problem. In section 2, we 
describe the criteria we used for evaluating different possible online solution 
schemes and introduce the strategy that we have chosen. Experimental com- 
petitive results for the application of this strategy on real-world instances are 
reported in section 4. Section 5 summarizes the key points of this paper. 

2 Online Strategy 

We postpone until the next section the discussion about how the Vdp in- 
stances corresponding to snapshot situations are solved and focus first on the 
more subtle online-problem: the search for a strategy to carry out the actual 
planning without knowing where future events will pop up. A decision that 
is “optimal” at some point in time can prove later to have been unwise. In 
particular, even if we were able to compute locally optimal dispatches for any 
snapshot situation this does not mean that we obtain a dispatch which is (in 
hindsight) optimal for the whole planning period. 

A by now standard tool for measuring the “goodness” of an online algo- 
rithm is competitive analysis [6]. Basically, one compares the solution pro- 
vided by the algorithm for an instance of the online problem with the solution 
provided by a “hindsight” adversary, which solves the same problem instance 
but with knowledge of the whole input data in advance. It is usually almost 
impossible to obtain theoretical proofs of (useful) competitive results, except 
for elementary problems. Nevertheless, the concepts arising from competitive 
analysis can still be used in practice, in the form of experimental a-posteriori 
analysis of online strategies (see, e.g., [2] for some real-world examples), pro- 
vided a measured hindsight adversary can be found. 

In our case, this a-posteriori analysis was carried out over input data col-
lected during one (resp. two) hour(s) of operation of the current system at
ADAC. This data was fed to our algorithm in a way that simulated how it
would occur in practice: each event was labeled by a time-stamp that indi-
cated the moment at which it became known to the system. Only after that
time was the algorithm allowed to incorporate information from this event
in a solution. After having briefly considered several simple online heuristics,
we decided to follow a REPLAN strategy.

This strategy consists of recomputing from scratch, at certain moments in 
time, a completely new dispatch for the current Vdp snapshot (the so-called 
offline problem). REPLAN assumes that an algorithm for this offline problem 
is at hand, which is capable of delivering solutions with certain guaranteed 
quality under real-time conditions (see [4] for more details). Fortunately, this 
is the case for the instances arising at ADAC, as we shall see in the next
section. 
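The replanning moments of such a strategy can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `replan_moments` and its parameters are hypothetical, and the sketch only determines when a new dispatch would be computed and how many events are visible at that moment, leaving the offline solver itself out.

```python
def replan_moments(release_times, horizon, interval=None):
    """Return a list of (t, known) pairs for a REPLAN-style simulation.

    t     -- a moment at which a new dispatch would be computed
    known -- number of events whose time-stamp is <= t (visible to the solver)

    interval=None: replan whenever a new event arrives.
    interval=s:    replan every s seconds until the horizon.
    (Hypothetical helper for illustration only.)
    """
    releases = sorted(release_times)
    if interval is None:
        moments = sorted(set(releases))
    else:
        moments = [k * interval for k in range(1, int(horizon // interval) + 1)]
    plan = []
    for t in moments:
        # Only events that have already become known may enter the snapshot.
        known = sum(1 for r in releases if r <= t)
        plan.append((t, known))
    return plan


if __name__ == "__main__":
    stamps = [5, 70, 70, 130]               # made-up time-stamps in seconds
    print(replan_moments(stamps, 180, 60))  # fixed 60-second interval
    print(replan_moments(stamps, 180))      # replan on every new event
```

In a full simulation, each returned moment would trigger one run of the offline algorithm on the snapshot formed by the known, still unserved events.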

3 Auxiliary Optimization Problem 

In the following, we briefly specify the form of Vdp that is tackled by our 
algorithm. More details concerning both the model and the solution method
can be found in [5]. 

An instance of the Vdp consists of a set of units, a set of contractors, and
a set of events. Each unit $u$ has a current position $o_u$, a home position, a
logon time, a shift end time, and a set of capabilities $F_u$. Moreover,
the costs related to using this unit are specified by values for costs per time
unit for each of the following actions: driving, serving, and overtime.
Each contractor $v$ has a home position $d_v$ and a set of capabilities $F_v$.
The costs for booking the contractor are specified by a fixed value per service.
Each event $e$ has a position, a release time, a deadline, a service
time, and a set of required capabilities $F_e$. Moreover, missing the deadline
of an event means incurring a lateness cost equal to the delay times the
value of a lateness coefficient.

A feasible solution of the Vdp (a dispatch) is an assignment of events to 
units and contractors capable of serving them, as well as a tour for each unit 
such that all events are assigned, the service of events does not start before 
their release times (waiting of units is allowed at no extra cost), and all tours
for all units start at their current positions not before their logon times and 
end at their home positions. The costs of a dispatch are the sum of all unit 
costs, contractor costs, and event costs. 
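To make the cost structure concrete, the following sketch evaluates the cost of a single unit tour. It rests on assumptions that are not taken from the paper: travel times are Euclidean distances at unit speed, the delay of an event is measured at the start of its service, and all names (`Event`, `tour_cost`, the cost parameters) are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Event:
    pos: tuple          # (x, y) position of the event
    release: float      # service must not start before this time
    deadline: float     # soft deadline for the start of service (assumption)
    service: float      # service duration
    late_coeff: float   # lateness cost per time unit of delay


def tour_cost(start_pos, home_pos, logon, shift_end, events,
              c_drive, c_serve, c_over):
    """Cost of serving `events` in the given order with one unit."""
    t, pos, cost = logon, start_pos, 0.0
    for e in events:
        travel = hypot(e.pos[0] - pos[0], e.pos[1] - pos[1])
        t += travel
        cost += c_drive * travel
        t = max(t, e.release)                             # waiting is free
        cost += e.late_coeff * max(0.0, t - e.deadline)   # soft time window
        t += e.service
        cost += c_serve * e.service
        pos = e.pos
    travel = hypot(home_pos[0] - pos[0], home_pos[1] - pos[1])
    t += travel
    cost += c_drive * travel
    cost += c_over * max(0.0, t - shift_end)              # overtime after shift end
    return cost


if __name__ == "__main__":
    e = Event(pos=(3.0, 4.0), release=0.0, deadline=4.0, service=2.0, late_coeff=10.0)
    print(tour_cost((0.0, 0.0), (0.0, 0.0), 0.0, 20.0, [e], 1.0, 2.0, 5.0))
```

In the example the unit arrives at time 5, one time unit after the deadline, so the lateness penalty contributes 10 to a total cost of 24.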

Following a common approach in the vehicle routing literature (see, e.g., 
[1] and the references therein), we state our model using binary tour variables. 
The Vdp can then be formulated as a set partitioning problem, where the 
ground set of events and units has to be partitioned using a family of subsets 
that represent unit tours (plus some special subsets to account for the alter- 
native of assigning events to contractors). The advantage of such a model lies 
in its flexibility to incorporate complicated technical and organizational side 





171 



constraints within the individual tours and the possibility to tweak the cost 
function to achieve certain desired properties of the online behaviour (e.g., 
non-linear lateness penalty). This set partitioning model may be written as 
a huge integer linear program where the 0/1 restriction matrix contains one 
row for each event and unit and one column for each tour. To solve this 
problem, we use a column generation approach. Starting from some initial 
columns produced heuristically, the main iteration of our algorithm consists 
of adding new columns and re-solving the linear relaxation of the problem
until a certain stopping criterion is met. Whenever a new integral solution is 
found we output the corresponding dispatch. 
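In generic form, a set partitioning model of this kind can be written as follows. The notation is ours and not taken from the paper: $\mathcal{T}$ denotes the set of all columns (unit tours plus the special contractor columns), $c_T$ the cost of column $T$, $a_{iT} \in \{0,1\}$ indicates whether row $i$ (an event or a unit) is covered by column $T$, and $x_T$ is the binary tour variable.

```latex
\begin{align*}
\min\ & \sum_{T \in \mathcal{T}} c_T\, x_T \\
\text{s.t.}\ & \sum_{T \in \mathcal{T}} a_{iT}\, x_T = 1
      && \text{for every row } i \text{ (event or unit)}, \\
& x_T \in \{0, 1\} && \text{for every column } T \in \mathcal{T}.
\end{align*}
```

Column generation works on the linear relaxation of this program, restricted to the columns produced so far.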

The search for columns is done by enumeration in a depth-first-search
branch-and-bound tree (search tree, for short) for each unit. To prune the search
tree, a lower bound on the cost of a tour is used, which is based on the
dual prices of events and units obtained from the previous solution of the
linear problem. Our main contribution lies in what we call dynamic pricing
control. The depth and degree of the search tree, as well as the value of a
(negative) acceptance threshold imposed on the reduced cost of new columns 
are adjusted at each iteration according to the number of columns produced 
in the previous one. This ensures that (i) the effort of finding new columns 
is small in the beginning, when the dual variables are not yet in good shape, 
(ii) the dual variables are updated often in the beginning, (iii) this update is 
fast since the number of columns in the LP is still small, (iv) we can enforce 
the output of a feasible integer solution early, and (v) the search is exact later 
in the run when the dual information is reliable. 
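The role of the acceptance threshold can be sketched as follows. This is a generic pricing step, not the ZIBDIP code: the names `reduced_cost` and `price_out` and the data layout are hypothetical, and the rule by which the threshold is adapted between iterations is left out because the text does not specify it.

```python
def reduced_cost(tour_cost, unit_dual, event_duals):
    """Reduced cost of a tour column: cost minus the duals of the rows it covers."""
    return tour_cost - unit_dual - sum(event_duals)


def price_out(candidates, duals, threshold):
    """Keep the candidate tours whose reduced cost is below the threshold.

    candidates -- list of (unit, events, cost) triples
    duals      -- dict mapping each unit and event to its dual price
    threshold  -- acceptance threshold (zero or negative)
    """
    accepted = []
    for unit, events, cost in candidates:
        rc = reduced_cost(cost, duals[unit], [duals[e] for e in events])
        if rc < threshold:
            accepted.append((unit, events, cost, rc))
    return accepted


if __name__ == "__main__":
    duals = {"u1": 2.0, "e1": 10.0, "e2": 4.0}          # made-up dual prices
    tours = [("u1", ("e1",), 9.0),
             ("u1", ("e1", "e2"), 15.5),
             ("u1", ("e2",), 7.0)]
    print(price_out(tours, duals, threshold=-1.0))      # strict threshold
    print(price_out(tours, duals, threshold=0.0))       # exact pricing
```

A strongly negative threshold admits only very promising columns and keeps early iterations cheap, whereas a threshold of zero corresponds to exact pricing.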

Our algorithm ZIBDIP turned out to be very efficient on real-world
snapshot-instances provided by ADAC. In all tested cases, provably opti-
mal or near-optimal (<1%) solutions were found in less than five seconds on
state-of-the-art personal computers, even for high load situations containing
about 200 events and 100 units. These figures ensure compliance with the real-
time requirements of the application. The behavior of the algorithm remained
stable for problem instances with (artificially augmented) extreme load: ZIB-
DIP found a dispatch to attend 770 events with 200 units whose quality was
within 12% of the optimum after 5 seconds, within 5% of the optimum after 15
seconds, and within 2% after one minute.

Besides solving the real-time “snapshot” problems, our algorithm is
also used for the evaluation of online strategies: by running it a-posteriori on 
jobs collected during one/two hours, we found lower bounds on the value of 
the hindsight-optimum discussed in section 2. 



4 Computational Results on Real-World Data



The input data for the online-tests consisted of 68 one-hour instances and 68
two-hour instances that were extracted from accumulated whole-day datasets
provided by ADAC. The events occurring in these instances were labeled with
time-stamps that represent the moment in time when they pop up. Using this
information, a simulation was run on the ADOptCmd implementation of our 
algorithm to test six variants of the REPLAN strategy mentioned in section 
2. The total cost of the dispatch produced during the whole hour (resp. two 
hours) was compared to a lower bound on the hindsight optimum, which 
was obtained by solving to optimality the linear programming relaxation of 
the offline problem (i.e., the problem of finding an optimal dispatch if all 
events are known from the beginning). In all cases, the online algorithms 
were required to deliver their (partial) solutions within 15 seconds of running 
time. 
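The evaluation measure described above can be sketched in a few lines. The numbers in the example are made up, the function name `relative_errors` is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which variant was reported.

```python
from statistics import mean, pstdev


def relative_errors(online_costs, lower_bounds):
    """Relative error (in percent) of each online cost w.r.t. its lower bound."""
    return [100.0 * (c - lb) / lb for c, lb in zip(online_costs, lower_bounds)]


if __name__ == "__main__":
    # Made-up costs for two instances; the lower bound plays the role of the
    # LP relaxation value of the offline problem.
    errors = relative_errors([150.0, 120.0], [100.0, 100.0])
    print(errors, mean(errors), pstdev(errors))
```

Because the comparison is made against a lower bound rather than the hindsight optimum itself, the errors obtained this way overestimate the true gap.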

For both the one-hour and the two-hour instances, the relative error
of the online solution achieved by ADOptCmd for all six alternatives is well
below 50% on average, with very rare substantial deviations of up to
230.71%, obtained for the setting int120 (replan every 120 seconds) in one
data set. There is a difference in the performance of the six settings
in favour of newjob (replan with every occurrence of a new job) and int60
(replan every 60 seconds). This difference is, however, small enough to ensure
that there will be no serious performance problems in waiting, say, 60 seconds
for an optimization run to finish.

Figure 1 shows the distribution of the relative errors obtained for the best
two settings newjob and int60 on the one-hour instances. Setting newjob
achieves the best results on average (a relative error of 40.21% against 44.33%
for int60) and produces the smallest standard deviation (22.65 against 33.64).
This trend continues in the two-hour instances.




(a) newjob: $\mu = 40.21$, $\sigma = 22.65$ (b) int60: $\mu = 44.33$, $\sigma = 33.64$

Fig. 1. Relative errors in the solutions produced for the one-hour instances.



Another fact to be noticed is that there is no substantial degradation 
in the performance of the algorithm when switching from the one-hour to 
the two-hour instances. In fact, the relative error obtained by newjob was
slightly better in the second case. We expect the same to hold also for larger
intervals of time. The computation of the required lower bounds, however,
then becomes technically unmanageable.








5 Conclusion 

We have developed a specialized column generation algorithm ZIBDIP that 
solves a real-world large scale vehicle dispatching problem with soft time win- 
dows under realtime requirements. The problem arises as a subproblem in an 
online-dispatching task that was proposed to us by the German Automobile 
Association (ADAC). A key concept behind the algorithm is the Dynamic
Pricing Control, which can significantly speed up convergence of the column 
generation process, thereby making a method that has proven to be effective 
for large scale offline problems ready for the use in online-algorithms under re- 
altime requirements. A further advantage of the column generation approach 
is its flexibility to incorporate complicated restrictions: we are planning to 
use non-linear lateness penalties in future tests. 

Employing a-posteriori analysis on real-world problem instances provided
by ADAC, we tested the performance of several settings of the REPLAN
strategy (i.e., we determined a kind of experimental competitiveness). The best
results were obtained for the case when a new dispatch was computed either 
each time a new event was issued or at a fixed frequency of 60 seconds. 

Although the development of the final version of the online algorithm is 
still under way, the first results are promising: On average, the online costs 
are within 50% above a lower bound on the “hindsight” (offline) adversary — 
not too bad, in our experience, for an online algorithm. We hope, however, 
to improve on these figures by utilizing estimates of future events in the 
snapshot-dispatches. This is work in progress and leads to most interesting 
questions as to how knowledge about the future can be exploited in combi- 
natorial online optimization. 

References 

1. Jacques Desrosiers, Yvan Dumas, Marius M. Solomon, and François Soumis,
Time constrained routing and scheduling, Handbooks in Operations Research
and Management Science, vol. 8, Elsevier Science B.V., Amsterdam, 1995,
pp. 35-139.

2. Martin Grötschel, Sven O. Krumke, and Jörg Rambau, Online optimization of
complex transportation systems, in Online Optimization of Large Scale Systems
[3], pp. 714-739.

3. Martin Grötschel, Sven O. Krumke, and Jörg Rambau (eds.), Online optimiza-
tion of large scale systems, Springer Verlag, 2001.

4. Martin Grötschel, Sven O. Krumke, Jörg Rambau, Thomas Winter, and Uwe
Zimmermann, Combinatorial online optimization in real time, in Grötschel et al.
[3], pp. 687-713.

5. Sven O. Krumke, Jörg Rambau, and Luis M. Torres, Real-time dispatching of
guided and unguided automobile service units with soft time windows, Preprint
SC 01-22, Konrad-Zuse-Zentrum Berlin, Berlin, 2001.

6. D.D. Sleator and R.E. Tarjan, Amortized efficiency of list update and paging
rules, Communications of the ACM 28 (1985), 202-208.





Multi-Class User Equilibria under Social 
Marginal Cost Pricing 



Leonid Engelson¹, Per Olov Lindberg², and Maria Daneva²

¹ Inregia, Box 12519, SE-10229, Stockholm,

² Linköping University, Dept. of Mathematics, SE-581 83 Linköping,
e-mails: lee@inregia.se, polin@mai.liu.se, madan@mai.liu.se



Abstract. In the congested cities of today, congestion pricing is a tempting al-
ternative. With a single user class, Beckmann et al. already showed that “system
optimal” traffic flows can be achieved by social marginal cost (SMC) pricing. How-
ever, different user classes can have wildly differing time values. Hence, when in-
troducing tolls, one should consider multi-class user equilibria, where the classes
have different time values. With SMC pricing, Netter claims that multi-class equi-
librium problems cannot be stated as optimization problems. We show that,
depending on the formulation, the multi-class SMC-pricing equilibrium problem
(with different time values) can be stated either as an asymmetric or as a symmet-
ric equilibrium problem. In the latter case, the corresponding optimization problem
is in general non-convex. For this non-convex problem, we devise descent methods
of Frank-Wolfe type. We apply the methods and study a synthetic case based on
Sioux Falls. 



1 Overview 

Traffic in major cities has become one of the biggest problems for society. 
It has become a common standpoint among transportation planners that 
charging some kind of fees from the users of the road network is necessary. 

The classical social marginal cost pricing theory ([1], Ch. 4) states that 
if each user is charged a toll equal to the total value of time loss incurred 
on other users of the network, this will induce an equilibrium that is system 
optimal, assuming that all users have a fixed and identical time value. 

However, different traveller groups may have grossly different time values, 
in Stockholm, e.g., ranging by a factor of more than 17. In such situations, 
computed tolled equilibria need to recognize the variation of time values be-
tween user groups, thus leading to multi-class user equilibria (see, e.g., [2]). When
the Jacobian of the multi-class link travel costs is symmetric and positive def- 
inite, such equilibria are unique and can be found by solving a corresponding 
optimization problem. 

Netter [5], however, argues that when link travel times depend on class 
specific link volumes and are different for different user classes, the user equi- 
librium problem is not generally symmetric even in toll-free networks or in 
networks with fixed tolls. 







The subject of this paper is social marginal cost (SMC) pricing in net- 
works with several user classes that differ only by their value of time. We 
show in particular that the variational inequality defining the equilibrium 
can be stated in different forms, asymmetric or symmetric, hence not having 
or having a corresponding optimization formulation. We further see that this 
optimization problem in general is nonconvex. We conclude by stating an 
algorithm of Frank-Wolfe type for the SMC toll case, and applying it to the
classical Sioux Falls network. 



2 Multi-Class User Equilibria 

Consider a road network consisting of nodes $n \in N$ and directed links $a \in A$.
Let $P \subseteq N \times N$ be the set of OD pairs. Assume that OD demands $q_p^k$ between
the OD pairs $p \in P$ for each user class $k \in K$ are given and that to each link
$a$ are associated continuously differentiable cost functions $c_a^k : \mathbb{R}_+^{|K|} \to \mathbb{R}_+$
that give the cost of traversing link $a$ for a user in class $k$ dependent on the
class specific volumes on the link. For the time being, $c_a^k$ is a general function,
but it will be endowed with a special structure in the next section.

Let $R_p$ be the set of routes connecting OD pair $p$ and $R = \bigcup_{p \in P} R_p$. Let
$H = \{h \in \mathbb{R}_+^{|R||K|} : \sum_{r \in R_p} h_r^k = q_p^k,\ p \in P,\ k \in K\}$ denote the set of feasible
route flow vectors, and let
$F = \{f : f_a^k = \sum_{r \in R} \delta_{ra} h_r^k,\ a \in A,\ k \in K,\ h \in H\}$ be the set of
feasible link flows, where $\delta_{ra}$ is 1 if route $r$ traverses link $a$, 0 otherwise.

Definition 1. (extended Wardrop) $\bar f \in F$ is a multi-class equilibrium flow if
for any OD-pair $p$ and user class $k$, the class specific costs of routes actually
used (i.e. having route flow $h_r^k > 0$) are equal and not larger than those of any
unused routes.

Similarly to the single user class case (see [6], Thm 3.14), the equilibrium
condition can be written as a variational inequality in the set of link flows

$$\langle c(\bar f), f - \bar f \rangle \ge 0 \quad \forall f \in F, \tag{1}$$

where $c = (c_a^k)_{a \in A, k \in K}$.

Variational inequality problems (VIPs) are usually solved by reduction
to optimization problems (or series of such). When the link specific cost
functions are symmetric, i.e., with $f_a = (f_a^k)_{k \in K}$, when

$$\frac{\partial c_a^k(f_a)}{\partial f_a^l} = \frac{\partial c_a^l(f_a)}{\partial f_a^k} \quad \forall a \in A,\ k, l \in K, \tag{2}$$

then $c(f) = \nabla I(f)$, the gradient of some differentiable function $I : \mathbb{R}_+^{|A||K|} \to \mathbb{R}$.
In this case, all local minima of $I$ restricted to $F$ (and maybe some other
points) are solutions to the VIP (1). If, in addition, the function $I$ is convex,
then the VIP (1) is equivalent to the optimization problem:

$$\min I(f) \quad \text{s.t. } f \in F. \tag{3}$$

If moreover, $I$ is strictly convex, uniqueness of the solution to (3) and hence
of the equilibrium is guaranteed (see, e.g., [4]).

3 Equilibria under Social Marginal Cost Pricing 

For space reasons this section is quite terse; for more details, see [3]. In the 
remainder of the paper, it is assumed that the drivers’ perceived link travel 
cost consists of two components: the toll $p_a$ and the travel time $t_a$, which is a
twice differentiable function of the total link volume $f_a^{tot} = \sum_{k \in K} f_a^k$. Using
class specific time values $v_k > 0$, the perceived travel cost can be expressed
either in time or in monetary terms. Thus we define generalized time and
generalized cost of link $a$ for class $k$, respectively, as

$$\tilde t_a^k(f_a) = t_a(f_a^{tot}) + p_a / v_k, \tag{4a}$$

$$\tilde c_a^k(f_a) = v_k t_a(f_a^{tot}) + p_a. \tag{4b}$$

Under SMC pricing, the users have to pay for the delays they impose on
other users and one is interested in the traffic volumes that are established in
the network and the corresponding toll values. The link toll then is the sum
of all delay values for the users of the link caused by a marginal user, i.e.,

$$p_a = p_a(f) = t_a'(f_a^{tot}) \sum_{k \in K} v_k f_a^k. \tag{5}$$

The SMC equilibrium problem can now be formulated as the VIP (1) with
link cost functions defined by inserting (5) into (4a) or (4b). However, using
the costs $c_a^k = \tilde t_a^k$ the symmetry condition (2) is not satisfied, precluding an
optimization formulation. On the other hand, using the costs $c_a^k = \tilde c_a^k$, we get

$$c_a^k(f_a) = v_k t_a(f_a^{tot}) + t_a'(f_a^{tot}) \sum_{n \in K} v_n f_a^n,$$

which somewhat astonishingly satisfies (2). Integration yields

$$I(f) = \sum_{a \in A} t_a(f_a^{tot}) \sum_{k \in K} v_k f_a^k, \tag{6}$$

with the property that $\nabla I(f) = c(f)$. Note that (6) is the overall travel cost
in the network as perceived by society under the assumption that the travel time
of individual travellers is perceived as valuable as the travellers perceive it
themselves.

In general, $I(f)$ is not convex and multiple SMC equilibria corresponding
to different values of $I$ can exist (see example below).
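The symmetry claim can be checked by direct differentiation. In the sketch below, $f_a^{tot}$ denotes the total volume on link $a$ (the sum of the class volumes); this symbol, like the tilde on the generalized time, is our notation, and the argument $f_a^{tot}$ of $t_a$, $t_a'$ and $t_a''$ is suppressed.

```latex
\begin{align*}
% generalized cost (4b) with the SMC toll (5) inserted:
\frac{\partial c_a^k}{\partial f_a^l}
  &= (v_k + v_l)\, t_a' + t_a'' \sum_{n \in K} v_n f_a^n ,
  && \text{symmetric in } k \text{ and } l, \\
% generalized time (4a) with the SMC toll (5) inserted:
\frac{\partial \tilde t_a^k}{\partial f_a^l}
  &= t_a' + \frac{1}{v_k} \Bigl( v_l\, t_a' + t_a'' \sum_{n \in K} v_n f_a^n \Bigr),
  && \text{not symmetric unless } v_k = v_l .
\end{align*}
```

The first expression is unchanged when $k$ and $l$ are swapped, whereas the second changes with the factor $1/v_k$, which is why only the monetary formulation admits an objective function.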








4 A Two-Link Example 

Consider a network consisting of a single OD pair $p$ connected by two parallel
links $a$ and $b$ with identical travel time functions $t_a(u) = t_b(u) = u$. There
are two traveller classes with time values $v_1 = 1$ and $v_2 = 5$, and with travel
demand $q^1 = q^2 = 100$. The feasible set is
$F = \{f = (f_a^1, f_a^2, f_b^1, f_b^2) \ge 0 : f_a^1 + f_b^1 = f_a^2 + f_b^2 = 100\}$.
Without tolls, there is a continuous set of user
equilibria $\{f\}$, namely $f_a^1 = f_b^2 = \lambda$, $f_a^2 = f_b^1 = 100 - \lambda$ for $0 \le \lambda \le 100$. Note
that travel time and total volume on each link is constant across $\{f\}$.
Introduction of marginal congestion pricing leads to the VIP

$$(2f_a^1 + 6f_a^2)(g_a^1 - f_a^1) + (6f_a^1 + 10f_a^2)(g_a^2 - f_a^2) + (2f_b^1 + 6f_b^2)(g_b^1 - f_b^1) + (6f_b^1 + 10f_b^2)(g_b^2 - f_b^2) \ge 0, \quad \forall g \in F. \tag{7}$$

Considering different cases, one can see that the equilibria are
$f^{(1)} = (0, 80, 100, 20)$, $f^{(2)} = (100, 20, 0, 80)$ and $f^{(3)} = (50, 50, 50, 50)$ with
corresponding tolls $p^{(1)} = (400, 200)$, $p^{(2)} = (200, 400)$ and $p^{(3)} = (300, 300)$,
respectively. Whereas $f^{(1)}$ and $f^{(2)}$ are local (and global) minima of the
overall network cost function $I(f) = (f_a^1 + f_a^2)(f_a^1 + 5f_a^2) + (f_b^1 + f_b^2)(f_b^1 + 5f_b^2)$
on $F$, with $I(f^{(1)}) = I(f^{(2)}) = 56000$, $f^{(3)}$ is a saddle point with
$I(f^{(3)}) = 60000$ (see Figure 1). When the tolls $p^{(1)}$ or $p^{(2)}$ are enforced, the only existing
user equilibria are $f^{(1)}$ and $f^{(2)}$, respectively. Implementation of the tolls
$p^{(3)}$ does not affect the route choice, whence there is the same set of user
equilibria $\{f\}$ as in the toll free situation. Thus the equilibrium flow pattern
with fixed equilibrium tolls $p^{(3)}$ need not coincide with $f^{(3)}$. However, these
flow patterns are equivalent both from the individual and the societal points
of view, since total flow and travel time along each link, as well as the total
perceived travel cost at any of $\{f\}$ is the same as at $f^{(3)}$.

Fig. 1. Equilibria (•) and level curves and negative gradients of the function $I$.
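The numbers of this example are easy to verify numerically. The sketch below is written for this text only (the helper names `link_cost` and `total_cost` are hypothetical). It evaluates the SMC tolls, the class costs and the network cost for the three equilibria, using $t(u) = u$ and the time values 1 and 5.

```python
V = (1.0, 5.0)   # time values of the two classes in the example


def link_cost(link_flows):
    """Return (class costs, toll) on one link with t(u) = u, hence t'(u) = 1."""
    volume = sum(link_flows)
    toll = sum(v * x for v, x in zip(V, link_flows))   # t' * sum_k v_k f^k
    return [v * volume + toll for v in V], toll


def total_cost(f_a, f_b):
    """Network cost: sum over links of t(volume) * sum_k v_k f^k."""
    return sum(sum(link) * sum(v * x for v, x in zip(V, link))
               for link in (f_a, f_b))


if __name__ == "__main__":
    for f_a, f_b in [((0, 80), (100, 20)), ((100, 20), (0, 80)), ((50, 50), (50, 50))]:
        (cost_a, toll_a), (cost_b, toll_b) = link_cost(f_a), link_cost(f_b)
        print(f_a, f_b, "tolls", (toll_a, toll_b),
              "costs a", cost_a, "costs b", cost_b, "I", total_cost(f_a, f_b))
```

For the first flow pattern, class 1 faces costs 480 on link a and 320 on link b and therefore uses only link b, while class 2 faces 800 on both links and uses both, in line with the equilibrium definition.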








5 A Frank-Wolfe Algorithm for the SMC Equilibrium

It follows from the results of sections 2 and 3 that the SMC equilibria can be 
determined by solving the optimisation problem (3) with the objective I of 
(6). To this end one can use (an adaptation of) the Frank-Wolfe method.

Linearizing the objective, (3) decomposes into a set of shortest path prob- 
lems; one for each user class and OD-pair. Assigning all demand to these 
shortest paths gives an extreme point feasible solution to (3). Performing a 
line search from the current solution towards this extreme point solution, one 
arrives at a better point, unless the current point is already an equilibrium. 
The process can be initialized in, e.g., the free flow extreme point solution. 
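The steps just described can be sketched on the two-link example of section 4, with travel time $t(u) = u$, time values 1 and 5, and demand 100 per class. This is an illustrative sketch and not the authors' code: the data layout, the tie-breaking rule and the function names are assumptions, and the experiments reported in the paper used a different cost function. Since the objective is a quadratic form in this example, the line search is done in closed form.

```python
V = (1.0, 5.0)            # time values of the two classes
DEMAND = (100.0, 100.0)   # demand of each class


def total_cost(f):
    """Objective for f = [[f_a^1, f_a^2], [f_b^1, f_b^2]] with t(u) = u."""
    return sum(sum(link) * sum(v * x for v, x in zip(V, link)) for link in f)


def class_costs(f):
    """Gradient of the objective: c^k = v_k * volume + sum_n v_n f^n on each link."""
    return [[v * sum(link) + sum(w * x for w, x in zip(V, link)) for v in V]
            for link in f]


def frank_wolfe(f, max_iter=1000, tol=1e-9):
    history = [total_cost(f)]
    for _ in range(max_iter):
        c = class_costs(f)
        # Linearized subproblem: send each class entirely to its cheapest link
        # (ties go to link a; this tie-breaking rule is an assumption).
        target = [[0.0, 0.0], [0.0, 0.0]]
        for k in range(2):
            best = 0 if c[0][k] <= c[1][k] else 1
            target[best][k] = DEMAND[k]
        d = [[target[l][k] - f[l][k] for k in range(2)] for l in range(2)]
        slope = sum(c[l][k] * d[l][k] for l in range(2) for k in range(2))
        if slope > -tol:          # no descent direction: equilibrium reached
            break
        # Quadratic form: I(f + s d) = I(f) + s * slope + s^2 * I(d).
        curvature = total_cost(d)
        step = 1.0 if curvature <= 0 else min(1.0, -slope / (2.0 * curvature))
        f = [[f[l][k] + step * d[l][k] for k in range(2)] for l in range(2)]
        history.append(total_cost(f))
    return f, history


if __name__ == "__main__":
    print(frank_wolfe([[100.0, 100.0], [0.0, 0.0]])[0])   # symmetric-type start
    f_end, hist = frank_wolfe([[10.0, 60.0], [90.0, 40.0]])
    print(f_end, hist[-1])
```

Started from the extreme point that puts all demand on link a, the sketch takes one step of length 0.5 to the symmetric flow (50, 50, 50, 50) and stops there, because the linearized problem offers no descent direction at the saddle point. From an asymmetric start the objective decreases monotonically.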

6 Some Experimental Results 

6.1 The Two Link Network 

The algorithm has first been applied to the two link network in the example 
above, although with a different but qualitatively equivalent cost function. 
Due to the symmetric network structure, the algorithm, when started with 
the free flow conditions, quickly reaches the saddle point equilibrium and gets 
stuck there. When the algorithm is started from another feasible link flow, it 
reaches one of the local optima, although experiencing a lot of zig-zagging. 



6.2 Sioux Falls Network 

We have also applied the algorithm to the classical Sioux Falls network. 
For each OD-pair, we subdivided demand into three user classes (work, 
business and other trips) with class fractions equal to the Stockholm case 
(.754, .036, .210). We also set the time values to their Stockholm values, 
$v = (.98, 3.30, .19)$, but normalized to average time value equal to 1. Note
that, since the problem is nonconvex, we do not get an underestimate of the
optimal value, when we solve the linearized problem, but it is reasonable to 
assume that we get a local lower bound in the limit. In order to get an ac- 
curate estimate of one local optimal value, we performed as many as 10000 
iterations (which would give a relative error of 10“^ in the single class case). 

Starting in the free flow solution, we get an objective value of 71.09, 
compared to the single class system optimum 71.94 and the system cost of
the single class user equilibrium at 74.80. The iteration history of the relative
error $(I(f^{(i)}) - \mathrm{LBD})/\mathrm{LBD}$ (where LBD is the optimal value of the linearized
problem at iteration i = 10000) is displayed in a log-log diagram in figure 
2a, together with the iteration history for the single class user optimum case 
(with LBD equal to the full precision optimal value), given for reference. As 
can be seen, apart from the first 10 iterations the progress in the multi-class 
case is comparable to that of the single class case. 








In figure 2b we display iteration histories for the accurate free flow run 
together with three runs starting at random extreme points (with LBD from 
the free flow run). The behavior of the random runs is in line with the
accurate one, making it probable that there is only one local optimum. 





Fig. 2. Iteration histories of SMC calculations a. (left) multiclass compared to single 
class, b. (right) different initial flow patterns. 



References 

1. Beckmann, M., McGuire, C., and Winsten, C. (1956) Studies in the economics 
of transportation. Yale University Press, New Haven. 

2. Dafermos, S. (1973) Toll patterns for multiclass-user transportation networks.
Transportation Sci. 7, 211-223.

3. Engelson, L., Lindberg, P.O., and Daneva, M. (2002) Congestion Pricing of
Road Network Users with Different Time Values, Technical Report LiTH-MAT-
R-2002-10, Linköping University, forthcoming.

4. Nagurney, A. (1993) Network economics: a variational inequality approach. 
Kluwer Academic Publishers Group, Dordrecht. 

5. Netter, M. (1972) Equilibrium and marginal cost pricing on a road network with 
several traffic flow types. Proceedings of the 5th International Symposium on 
the Theory of Traffic flow and Transportation, G.F. Newell, Elsevier, 155-163. 

6. Patriksson, M. (1994) The Traffic Assignment Problem - Models and Methods. 
VSP, Utrecht, The Netherlands. 





School Bus Routing and Scheduling Problem 



Michela Spada, Michel Bierlaire, and Thomas M. Liebling 

Ecole Polytechnique Fédérale de Lausanne, Institute of Mathematics, Switzerland.



Abstract. We consider the school bus routing and scheduling problem, where 
transportation demand is known and bus scheduling can be planned in advance. 
We first propose a modeling framework which aims to optimize a level of service 
for a given number of buses. Then, we describe a procedure building first a feasible 
solution, and subsequently improving it, using a heuristic. After the performance 
analysis of three types of heuristics on real and synthetic data, we recommend the 
use of simulated annealing exploring infeasible solutions, which performs slightly 
better than all others. More importantly, we find that the performance of all heuris- 
tics is not globally affected by the choice of the parameters. This is important from 
a practitioner viewpoint, as the fine tuning of algorithm parameters is not critical 
for its performance. 



1 Introduction 

Busing systems are common in countries where school assignment is imposed 
by home location, as in Switzerland, France, and Italy for instance. We pro- 
pose a modeling framework, where the concept of level of service is captured 
by two different objective functions (Section 3). We propose and analyze au- 
tomatic procedures to build a solution, all of which start by generating an 
initial feasible solution (see Section 4). In Section 5, we then propose three 
automatic procedures to improve the quality of the solution: a tabu search 
heuristic, and two variants of simulated annealing. We have tested several 
instances of each proposed heuristic on a set of 30 problems. The results, 
presented in Section 6, suggest that the simulated annealing heuristic explor-
ing infeasible solutions is the most efficient of the three, irrespective of
the choice of parameters. 

2 Literature review 

Various approaches are described in the literature for the school bus routing 
and scheduling problem. They differ in the way the problem is decomposed,
in the modeling assumptions and in the solution algorithms. We can
roughly distinguish two types of approaches. The first (see [1], [2], [3], [4],
and [5]) consists in decomposing the problem and generating for each school
simple routes from the children’s homes. Those routes are then merged in order
to optimize the objective function. In this approach no mixed loads are al-
lowed, i.e. children from different schools are never carried in the same bus
at the same time. The second one [6] solves the whole problem in one stage
allowing mixed loads. From the modeling viewpoint, most authors aim at
minimizing the costs, that is the number of buses or a combination of the 
number of buses and the total travel time. In this paper, we prefer the first 
approach, but allowing mixed loads, and we explicitly optimize the level of 
service provided by the bus operator. 



3 Problem formulation 

The transportation network is given by $G = (V, E, \ell)$ where $V$ is a set of
nodes, $E$ a set of edges and $\ell$ a weight function associating a non-negative
weight to each edge (typically, the travel time necessary to traverse the edge).
We denote by $\mathcal{H} = \{h_1, \ldots, h_n\} \subseteq V$ the set of nodes, called homes, where
children get on the bus. Similarly, we denote by $S = \{s_1, \ldots, s_{|S|}\} \subseteq V$ the
set of nodes where schools are located. For each school $s_i \in S$, we denote
by $H(s_i) \subseteq \mathcal{H}$ the set of home nodes of children attending school $s_i$. In the
model, a child $c$ may represent in reality a group of several children with the
same characteristics, that is the same home location and the same school.
This simplifies the formulation with a moderate loss of flexibility. There are
$B$ buses to transport the children. Bus $b$ has a given capacity, measured in the
same unit as the children’s space requirement, and is typically a number of
seats. Note that some buses may load more children than the actual number 
of seats, for example when the bus has benches rather than individual seats 
and young children are carried. 

A solution to the school bus routing and scheduling problem consists in 
specifying for each bus a tour, that is a list of stops, and for each stop, a list 
of children to pick-up and/or drop. Consequently, for each bus b and each 
stop, we impose that children get on bus b only at home, and get off only at 
school. The bus schedule depends on the school starting times.

The number of buses being limited, each child may experience a delay 
due to the fact that the bus must pick-up other children on its way to the 
school. For each child, we define the delay as the difference between its actual 
journey time and the shortest possible time between home and school. Also, 
as buses may have to perform several tours, some children will be dropped at 
school too early. The time spent by the child waiting at school for the class 
to begin is referred to as the waiting time. The sum of the delay and the 
waiting time is called the time loss of the child $c$, denoted by $l(c)$. We will
consider here two related performance measures for the bus tour schedules,
leading to two different objective functions to be minimized. The first one is
the total time loss summed over all children or, equivalently, the mean time loss
over all children, to be minimized over all tour schedules:

$$F_1 = \sum_{c \in C} l(c).$$

The second measure is the maximum time loss over all children, to be mini-
mized over all tour schedules. It aims at balancing time losses.

$$F_2 = \max_{c \in C} l(c).$$
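The two performance measures can be illustrated with a small sketch. The tuple layout for a child and the function names `time_loss` and `objectives` are assumptions made for this example, and all times are taken to be in minutes.

```python
def time_loss(journey_time, shortest_time, arrival, class_start):
    """Time loss of one child: delay on the bus plus waiting time at school."""
    delay = journey_time - shortest_time
    waiting = class_start - arrival
    return delay + waiting


def objectives(children):
    """Return (total time loss, maximum time loss) over all children.

    children -- list of (journey_time, shortest_time, arrival, class_start)
    """
    losses = [time_loss(*child) for child in children]
    return sum(losses), max(losses)


if __name__ == "__main__":
    # Made-up data: three children with class starting at minute 480.
    data = [(20, 12, 470, 480), (15, 15, 480, 480), (30, 10, 475, 480)]
    print(objectives(data))
```

In the example the individual time losses are 18, 0 and 25 minutes, so the total is 43 and the maximum is 25. A schedule that lowers the total may still leave one child with a large loss, which is what the second measure guards against.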

The following constraints must be satisfied in the problem.

• Each child should arrive at school on time. 

• Each child is carried by exactly one bus and steps out of the same bus it 
boarded. 

• No bus can carry more children than its capacity allows for. 

• For each tour, the schedule must be such that the bus can complete the
trip within time, that is the travel time between two stops cannot be
less than the minimum travel time between them.

This optimization problem can be formulated as a linear integer program, 
a program of very large scale, even for small problems. 

4 Construction of a feasible solution 

The heuristic idea for constructing a feasible solution is to first ignore the number
of available buses B, and use as many of them as needed to satisfy all other
constraints. We denote by Q the capacity of these virtual buses, where
Q is the minimal capacity in the bus fleet. Schools are considered in increasing
order of their starting times. For each school s, the associated home nodes
in H(s) are sorted in decreasing order of their distance from s. A first chain
is constructed between the first two homes of the sorted H(s). Then the remaining homes of H(s)
are successively inserted in that chain, respecting the minimal chain length
increase principle and the bus capacity Q. Each virtual bus b serves a unique
school s(b). Finally, for each virtual bus b, we insert the node corresponding
to school s(b) in the chain. As both chain traversals are possible, we consider
inserting the school at either one of the chain's extremities, preferring the one resulting
in the shortest chain. If needed, the chain is reversed so that the school
appears last. Finally, we define the tour associated with the constructed chain,
and we impose that the bus is scheduled to arrive exactly when class starts,
the bus uses the shortest route, only children going to school s(b) step into the
bus, there is no pick-up at school, there is no drop-off at home, and all children
that were picked up are dropped off at school.
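The chain construction for one school can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes Euclidean distances between points in the plane, one capacity unit per home, and that a new virtual bus is opened as soon as the current chain is full. The function names are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two points (assumed metric)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def insertion_cost(chain, pos, node):
    """Increase in chain length if node is inserted before index pos."""
    if pos == 0:
        return dist(node, chain[0])
    if pos == len(chain):
        return dist(chain[-1], node)
    a, b = chain[pos - 1], chain[pos]
    return dist(a, node) + dist(node, b) - dist(a, b)


def build_tours(school, homes, capacity):
    """Build one tour per virtual bus; each home uses one capacity unit."""
    # Homes are handled in decreasing order of distance from the school.
    ordered = sorted(homes, key=lambda h: dist(h, school), reverse=True)
    chains = []
    for home in ordered:
        if not chains or len(chains[-1]) >= capacity:
            chains.append([home])  # open a new virtual bus
            continue
        chain = chains[-1]
        # Minimal chain length increase principle.
        best = min(range(len(chain) + 1),
                   key=lambda p: insertion_cost(chain, p, home))
        chain.insert(best, home)
    tours = []
    for chain in chains:
        # Attach the school at the cheaper extremity; it must come last.
        if dist(school, chain[0]) < dist(chain[-1], school):
            chain.reverse()
        tours.append(chain + [school])
    return tours
```

For homes lying on a line through the school, the sketch returns a tour that starts at the farthest home and ends at the school.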

If the solution obtained by applying this procedure is composed of more
than B buses, tours are merged pairwise as many times as necessary to
obtain exactly B buses. Tours are merged in such a way as to minimize the
modifications of the current schedule. In this operation, we prefer mergers that do
not modify the schedule, i.e., that generate a minimal waiting time for the
bus driver who has to perform two tours in succession. If this is not possible,
we prefer mergers that advance the departure time of the first tour as little as
possible.








5 Algorithmic improvement 

In order to improve an initial solution found, e.g., by the above procedure, we
developed and compared three heuristic procedures that try to achieve a better
value of the objective function ($\mathcal{F}_1$ or $\mathcal{F}_2$) while maintaining feasibility. These
three local search heuristics are two variants of simulated annealing (a feasible
and an infeasible version) and tabu search.

We have implemented a variant of simulated annealing (see [7] and [8])
in which the temperature, on reaching a very low value, is reset to its initial
value. We consider two neighborhoods: the feasible neighborhood $\mathcal{N}_f$ generates
only feasible solutions, while the infeasible neighborhood $\mathcal{N}_i$ enlarges the set of
candidate solutions and may generate solutions violating the bus capacity
constraint. In both neighborhoods, solutions are modified by removing one
child from a bus and trying to place the child in another one. In the feasible
neighborhood $\mathcal{N}_f$, child c is inserted in the bus b' that yields the largest
improvement (or the least deterioration) of the solution in terms of the chosen
objective function, $\mathcal{F}_1$ or $\mathcal{F}_2$. The other neighborhood, $\mathcal{N}_i$, modifies
a solution in the same way as $\mathcal{N}_f$, except that the capacity constraint is not
enforced when child c is inserted in bus b'. Therefore, $\mathcal{N}_i$ may create infeasible
solutions. In order to drive the search back to feasibility, we augment the objective function
value $\mathcal{F}_1$ or $\mathcal{F}_2$ by a penalty depending on the capacity violation and on the
number of consecutive iterations visiting infeasible solutions.
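The ingredients of the infeasible variant can be sketched as below. The exact penalty used by the authors is not given here (see [11]); the linear form, the `weight` parameter, the geometric cooling factor `alpha` and all function names are assumptions made only for illustration.

```python
import math
import random


def penalized_objective(objective, loads, capacity,
                        consecutive_infeasible, weight=1.0):
    """Objective plus a penalty that grows with the capacity violation
    and with the number of consecutive infeasible iterations
    (assumed linear form, not taken from the paper)."""
    violation = sum(max(0, load - capacity) for load in loads)
    return objective + weight * violation * (1 + consecutive_infeasible)


def accept(delta, temperature, rng):
    """Standard Metropolis rule: always accept improvements, accept a
    deterioration delta > 0 with probability exp(-delta / temperature)."""
    return delta <= 0 or rng.random() < math.exp(-delta / temperature)


def next_temperature(temperature, t_init, t_min, alpha=0.95):
    """Geometric cooling (assumed); reheat to t_init below t_min."""
    cooled = alpha * temperature
    return t_init if cooled < t_min else cooled
```

A search loop would call `accept` on the difference of penalized objective values and reset the counter of consecutive infeasible iterations whenever a feasible solution is visited.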

We have also implemented a standard tabu search (see [9] and [10]),
considering only the feasible neighborhood $\mathcal{N}_f$. A complete description of these
heuristics can be found in [11]. The performance of these methods is analyzed
and discussed in Section 6.

6 Case studies 

In order to validate the heuristic approaches, these were tested on both real 
world and synthetic data sets. 

The real data comes from two Swiss towns, Savigny and Forel. The net- 
work contains 34 nodes, including 12 schools: 4 kindergartens starting at 8.35, 
6 primary schools starting at 8.15 and 2 high schools starting at 7.40. For 
the school year 1997-1998, four buses were used to transport 274 children. 
The capacity of each bus is 210 units, and each high school child takes up 10 
units in the bus, while kindergarten and primary school children take up 7 
units. 

The synthetic data has been generated in order to analyze the relative effi- 
ciency of the methods as a function of the network size. For a given number of 
nodes randomly distributed in a square, the network is obtained from a Delau- 
nay triangulation, from which links are removed such as to obtain a realistic 
degree for each node, while maintaining the connectivity. The distributions 
of homes, schools and trips have been chosen to resemble the proportions in 
the real data set. 








The experimental design is based on four variables: the network, the number
of buses, the objective function and the heuristic. In addition to the Savigny
and Forel network, we have generated four artificial networks with 10,
20, 30 and 50 nodes, respectively. For each of the five networks, we have
run several scenarios for the number of buses (three on average), and the
two objective functions $\mathcal{F}_1$ and $\mathcal{F}_2$, to obtain a total of thirty problems. We
have considered three types of heuristics, that is, tabu search and simulated
annealing with the feasible neighborhood and with the infeasible neighborhood. In each
case, simulated annealing has been run using four different initial temperatures
and five different seeds for the random number generator. The tabu
search algorithm has been run using two different neighborhood sizes and
ten different seeds for the random number generator. Thus, twenty instances
of each type of heuristic were run on each problem, totaling 1800 runs, carried out on a Pentium
III 499 MHz. On this machine an instance with 50 nodes and 9 buses took
about 10 hours to complete 50’000 solution evaluations, that is, about 0.7 sec.
per evaluation. We now summarize the main observations from these runs.
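As a rough consistency check of the figures quoted above (the number of bus scenarios per network is only stated as an average, so the first product is approximate):

```latex
\begin{align*}
\text{problems} &\approx 5 \text{ networks} \times 3 \text{ bus scenarios} \times 2 \text{ objectives} = 30,\\
\text{runs} &= 30 \text{ problems} \times 3 \text{ heuristics} \times 20 \text{ instances} = 1800,\\
\text{time per evaluation} &\approx \frac{10 \times 3600\ \text{s}}{50\,000} = 0.72\ \text{s}.
\end{align*}
```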

Clearly, the quality of the solution depends on the specific heuristic used
and the choice of its parameters. In order to compare the overall quality of the
heuristics, we adopt a variant of the performance profiles analysis proposed
by Dolan and Moré [12].

If $f_{p,a,i}$ is the performance index of instance i of algorithm a solving
problem p, then the performance ratio is defined by

$$r_{p,a,i} = \frac{f_{p,a,i}}{\min_{a',i'} f_{p,a',i'}}.$$

For any given threshold $\pi$, the overall performance of algorithm a is given by

$$\rho_a(\pi) = \frac{\Phi_a(\pi)}{n_p \, n(a)},$$

where $n_p$ is the number of problems considered, $n(a)$ is
the number of instances of algorithm a which have been run, and $\Phi_a(\pi)$ is
the number of problems and instances of algorithm a for which $r_{p,a,i} \le \pi$.
We refer the reader to [12] for more details about the benchmarking method.
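A minimal sketch of such a profile computation is given below. It assumes that the performance ratio of a run is its index divided by the best index obtained on the same problem by any algorithm and instance, and that the profile value is the fraction of (problem, instance) pairs of an algorithm whose ratio does not exceed the threshold. The data layout and the function name are hypothetical.

```python
def performance_profiles(indices, thresholds):
    """indices[p][a] is a list of performance indices (smaller is
    better), one per instance of algorithm a on problem p.
    Returns, per algorithm, the profile value at each threshold."""
    algorithms = sorted({a for runs in indices.values() for a in runs})
    profiles = {}
    for a in algorithms:
        ratios = []
        for runs in indices.values():
            # Best index on this problem over all algorithms and instances.
            best = min(v for values in runs.values() for v in values)
            ratios.extend(v / best for v in runs[a])
        profiles[a] = [sum(r <= pi for r in ratios) / len(ratios)
                       for pi in thresholds]
    return profiles
```

With two problems and two instances per algorithm, each profile is a step function taking values in {0, 0.25, 0.5, 0.75, 1}; a curve that rises earlier indicates an algorithm that is more often close to the best result found.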

The performance indices we have selected are the relative improvements 
of the objective function (that is the ratio between the final and initial values 
of the objective function) after a given number of candidate evaluations (the 
computation budget). Interestingly, the profiles are qualitatively similar for 
different values of the computation budget. These results are encouraging, 
as they indicate that the proposed approach is rather robust and does not 
depend too heavily on the tuning of the heuristics parameters. We present 
here the results for an index based on a fixed budget of solution evaluations 
depending on the problem size (100|V| evaluations). 

We compare tabu search and the two versions of simulated annealing for
the problem with objective functions $\mathcal{F}_1$ and $\mathcal{F}_2$. In Figure 1, the plots at the
bottom are just zooms of the top plots, in order to emphasize the profile for
small values of $\pi$. For the sake of clarity, the plots for $\mathcal{F}_1$ and $\mathcal{F}_2$ have been
separated, although the performance profile analysis has been performed over
all methods.

The most noticeable performance is achieved by the infeasible version
of simulated annealing when objective function $\mathcal{F}_1$ is considered (see Figure 1(c)).
It is interesting to note the slope of the profiles for each objective
function (compare, for instance, Figures 1(c) and 1(d)). The profile is steeper
with objective function $\mathcal{F}_1$, illustrating a more robust behavior of the heuristics
when $\mathcal{F}_1$ is preferred over $\mathcal{F}_2$. This is due to the fact that the same value
of the objective $\mathcal{F}_2$ may be achieved for a large number of different solutions,
as only the largest time loss is considered. Therefore, the heuristics are unable
to discriminate among a large class of solutions with the same objective, and
are more likely to cycle randomly among them.

[Figure 1: panels (a) $\mathcal{F}_1$, (b) $\mathcal{F}_2$, (c) zoom of (a), (d) zoom of (b)]

Fig. 1. Performance profiles for budget = 100 |V|



All heuristics exhibit the same type of behavior: a significant improve- 
ment of the objective function in the early iterations, and a slower reduction 
rate in subsequent iterations. This is illustrated by Figure 2 where the x-axis 
represents the number of evaluations of candidate solutions, and the y-axis is 
the ratio of the objective function for the solution computed by each heuristic
to the objective function of the initial solution. Results for $\mathcal{F}_1/|C|$ are
represented on the left figure, and results for $\mathcal{F}_2$ on the right.









Fig. 2. Heuristic improvements with |V| = 50 and B = 9 



Further results can be found in [11]. 

References 

1. Bodin, L., Berman, L. (1979) Routing and Scheduling of School Buses by Computer.
Transportation Science 13, 113-129.

2. Angel, R., Caudle, W., Noonan, R., and Whinston, A. (1972) Computer-assisted 
School Bus Scheduling. Management Science B 18, 279-288. 

3. Bennett, B., Gazis, D. (1972) School Bus Routing by Computer. Transportation 
Research 6, 317-326. 

4. Desrosiers, J., Ferland, J., Rousseau, J.-M., Lapalme, G., Chapleau, L. (1981) An 
Overview of a School Busing System. In: N. Jaiswal (Ed.), Scientific Management 
of Transport Systems, Vol. IX, International Conference on Transportation, New 
Delhi, November 26-28, 1980, 235-243. 

5. Gavish, B., Shlifer, E. (1979) An Approach for Solving a Class of Transportation 
Scheduling Problems. EJOR 3(2), 122-134. 

6. Braca, J., Bramel, J., Poser, B., Simchi-Levi, D. (1994) A Computerized Ap- 
proach to the New York City School Bus Routing Problem. Technical report. 
Graduate School of Business, Columbia University, NY. 

7. Kirkpatrick, S., Gelatt, C. D. J., Vecchi, M. P. (1983) Optimization by Simulated 
Annealing. Science 220, 671-680. 

8. Rossier, Y., Troyon, M., Liebling, T. M. (1986) Probabilistic Exchange Algo- 
rithms and Euclidean Traveling Salesman Problems. OR Spektrum 8(3), 151- 
164. 

9. Glover, F. (1977) Heuristic for Integer Programming Using Surrogate Con- 
straints. Decision Sciences 8, 156-166. 

10. Hansen, P., Jaumard, B. (1987) Algorithms for the Maximum Satisfiability
Problem. Technical report, Rutgers University.

11. Spada, M., Bierlaire, M., Liebling, T. M. (2002) Decision-aid Methodology for
the School Bus Routing and Scheduling Problem. Report # RO-20020819, École
Polytechnique Fédérale de Lausanne.

12. Dolan, E. D., Moré, J. J. (2002) Benchmarking Optimization Software with
Performance Profiles. Mathematical Programming, Series A 91, 201-213.





Covering Population Areas by Railway Stops 



Anita Schöbel¹ and Michael Schröder²

¹ Universität Kaiserslautern, Postfach 3049, D-67653 Kaiserslautern, Germany
² Fraunhofer Institut für Techno- und Wirtschaftsmathematik,
Gottlieb-Daimler-Strasse 49, D-67663 Kaiserslautern, Germany



Abstract. We study the installation of new stops (or stations) along existing links 
in public transportation, e.g. railway systems. This improves the coverage level of 
the system, i.e., the number of people living near some stop. On the other hand, 
additional cost is incurred and travel times tend to increase. 

We model this as a network location problem, where the network corresponds to the 
transportation links. Our main contribution is to model the population distribution 
by a system of compact subsets in the plane, the population areas. The goal is to 
cover all these areas with as few stops as possible along the railway tracks. We 
present an efficient algorithm for finding an optimal solution in the case of a single 
edge and show its applicability to real world problems. 



1 Introduction 

We define the coverage level of a public transportation system as the per- 
centage of the population that lives within a given distance from any of the 
stations. A relatively cheap way to improve the coverage level is to establish 
new stops along existing tracks. The costs incurred are the fixed and variable
costs of the stop and the cost of the additional running time of the trains
and their crews. Furthermore, the travel time of passengers increases if trains
stop more often. Therefore we want to achieve a certain coverage level with 
as few stops as possible. 

In the literature, location of stops in transportation systems has been mainly 
considered within a discrete setting, i.e., given a finite candidate set of pos- 
sible stops, which of them should be chosen to get a certain coverage level? 
Such discrete stop location problems can be reduced to set covering problems 
and have been studied in [5,6], and, very recently, in [7], with an application 
in Brisbane, Australia. Another discrete stop location model has been de- 
veloped by Laporte et al. [4]. They investigate which candidate stops along 
one given line in Sevilla should be opened, taking into account constraints 
on the interstation space. On the other hand, in the continuous stop location 
problem there is no discrete candidate set given, but the stops may be located 
anywhere along the given track system. This has first been considered heuris- 
tically in [2]. Theoretical results about the continuous location of stops can 
be found in [8], and an algorithm for finding such a set of stops is suggested 
in [3]. In these papers, the settlements (or population areas) are represented 
by one point each. This assumption is not very realistic, but simplifies the
analysis drastically, since only points need to be covered. For example, defining
coverage as the number of customers living closer than 2 km to their closest
station, and representing each settlement as a single point, yields a coverage
level for Germany of 61.3%. In contrast, if we consider the percentage of the
covered area, weighted by the population living there, the coverage level is
only 52.4%. So it makes sense to take into consideration the whole population
areas instead of just one point for each of them.

For our problem, we need the following: 

• Let P be a compact set describing the union of the population areas to
be covered.

• Let the transportation network be given as a planar connected graph
$G = (V, E)$ with straight-line embedding in the plane. Then

$$T = \bigcup_{e \in E} e = \{x : x \in e \text{ for some } e \in E\} \subseteq \mathbb{R}^2$$

denotes the set of points along the track system, where we assume that
each edge $e \in E$ (representing a single track) is given by a line segment
in the plane. We want to locate a set of stops $S \subseteq T$.

We now define what we mean by covering P. To this end, let d be a norm
distance, e.g., the Euclidean distance or the rectangular distance.

Definition 1 $\mathrm{cover}(S) = \{x \in \mathbb{R}^2 : d(x, s) \le r \text{ for some } s \in S\}$, where
$r > 0$ is given, and $S \subseteq T$. Moreover, we say that P is covered by S if
$P \subseteq \mathrm{cover}(S)$.

Complete Cover (CC) can now be stated as:

Find a minimum-cardinality set $S^* \subseteq T$ covering P.

Some properties should be mentioned first.

Lemma 1 1. (CC) is NP-hard.

2. (CC) is feasible if and only if $P \subseteq \mathrm{cover}(T)$.

3. (CC) may be feasible but not finitely feasible.

Proof: Part 1 follows from the NP-hardness of (CSLP), which is the special
case of (CC) in which P is a finite set of points, see [8]. Part 2 is clear, and
part 3 is shown by the example of Figure 1. □

For the following we assume that (CC) is feasible, i.e.,

$$P \subseteq \mathrm{cover}(T).$$

In the remainder of this paper we develop an algorithm for solving (CC) on 
a straight line segment. We then show that this approach can be quite well 
applied in practice. 









Fig. 1. Example for the Euclidean distance: Infinitely many points of T are
necessary to cover P.

2 The Algorithm 

We focus on (CC) in the case that T consists of one single edge e with
endpoints $v_1$ and $v_2$. We will develop an algorithm that constructs a finite
optimal solution if one exists.

We first need some notation. Let $<$ denote the natural ordering along the
edge e induced by $v_1 < v_2$. Then, for $x_1 \le x_2$ let

$$[x_1, x_2] := \{y \in e : x_1 \le y \le x_2\}.$$

For each $x \in P$ let us define

$$T(x) := \{y \in e : d(y, x) \le r\}$$

as the part of edge e that can be used to cover x, i.e., we know that at least one
point of $T(x)$ has to be chosen as a new stop. Since $T(x) = e \cap \{y : d(y, x) \le r\}$
is a convex set contained in e, we can find two points $l(x), r(x)$ such that
$T(x) = [l(x), r(x)]$. Note that for $y \in e$, $x \in P$ we know that

$$x \in \mathrm{cover}(y) \iff y \in T(x) \iff l(x) \le y \le r(x). \qquad (1)$$

Definition 2 Given some set $Q \subseteq P$, the (left) fixturing point of Q is

$$L(Q) := \inf\{r(x) : x \in Q\} \in e.$$

We are now in the position to present our algorithm for solving (CC).

Algorithm for solving (CC) on a single edge

Input: $T = e$, $P \subseteq \mathrm{cover}(T)$

Output: Opt

Step 1. Set Opt := $\emptyset$, P := P.

Step 2. Opt := Opt $\cup$ $\{L(P)\}$, P := P \ cover(L(P)).

Step 3. If P = $\emptyset$, STOP, otherwise go to Step 2.
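The three steps can be sketched for the special case in which P is a finite set of points and d is the Euclidean distance (the paper treats general compact sets; the restriction, the arc-length parametrization of the edge and the function names are assumptions made for this illustration).

```python
import math


def interval_on_edge(x, v1, v2, radius):
    """Return (l(x), r(x)) as arc-length positions on the segment
    v1 -> v2, or None if x cannot be covered from this edge."""
    ex, ey = v2[0] - v1[0], v2[1] - v1[1]
    length = math.hypot(ex, ey)
    ux, uy = ex / length, ey / length
    px, py = x[0] - v1[0], x[1] - v1[1]
    t = px * ux + py * uy                      # projection onto the edge
    h2 = max(0.0, px * px + py * py - t * t)   # squared distance to the line
    if h2 > radius * radius:
        return None
    half = math.sqrt(radius * radius - h2)
    lo, hi = max(0.0, t - half), min(length, t + half)
    return (lo, hi) if lo <= hi else None


def greedy_stops(points, v1, v2, radius):
    """Repeatedly place a stop at the fixturing point of the points
    that are still uncovered (Steps 1 to 3 for finite P)."""
    remaining = []
    for x in points:
        interval = interval_on_edge(x, v1, v2, radius)
        if interval is None:
            raise ValueError("point cannot be covered from this edge")
        remaining.append(interval)
    stops = []
    while remaining:
        y = min(hi for _, hi in remaining)     # L(P): smallest right end
        stops.append(y)
        # Since y <= r(x) for all remaining x, x is covered iff l(x) <= y.
        remaining = [(lo, hi) for lo, hi in remaining if lo > y]
    return stops
```

For a finite point set the loop always terminates, because the point whose right endpoint defines the fixturing point is itself covered in each iteration.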



Note that the algorithm need not terminate finitely. Fortunately,
Theorem 1 shows that this only happens if no finite optimal
solution exists. Moreover, we will show the correctness of the algorithm,
i.e., that in case of finite termination we obtain an optimal solution.








For the following, let $y_i$ denote the fixturing point found in iteration i and
let $P^i = P \setminus \mathrm{cover}\{y_1, \ldots, y_{i-1}\}$ be the set P at the beginning of iteration i.
We collect some simple properties in Lemma 2 and prove the correctness of
the algorithm in the following theorem.

Lemma 2 1. $y_i \le y_{i+1}$ for i = 1, 2, ...

2. If the algorithm terminates then $y_i < y_{i+1}$ for i = 1, 2, ...

3. If the algorithm terminates then Opt is a feasible solution of (CC).

Theorem 1 If there exists a finite solution of (CC) then the algorithm
terminates with an optimal solution Opt.

Proof: Let Opt = $\{y_1, y_2, \ldots\}$ be the solution found by the algorithm.
According to part 1 of Lemma 2 we know that $y_1 \le y_2 \le \ldots$ By assumption
of the theorem there exists a finite solution $Y^* = \{y^*_1, y^*_2, \ldots, y^*_S\}$ with
$y^*_1 < \ldots < y^*_S$. Since for all finite Opt we already know the feasibility (part 3
of Lemma 2), our goal is to show that $|\mathrm{Opt}| \le S$. To this end, we first prove
that

$$\mathrm{cover}\{y^*_k : k < i\} \cap P \subseteq \mathrm{cover}\{y_k : k < i\} \cap P.$$

$i = 1$: $\emptyset = \emptyset$.

$i \to i + 1$: From the induction hypothesis we know that

$$\mathrm{cover}\{y^*_k : k < i\} \cap P \subseteq \mathrm{cover}\{y_k : k < i\} \cap P,$$

yielding $P^i \subseteq P \setminus \mathrm{cover}\{y^*_k : k < i\}$.

Claim 1: $y_i \ge y^*_i$.

Suppose $y^*_i > y_i$. Since $y_i = L(P^i) = \inf\{r(x) : x \in P^i\}$ and $y^*_i > y_i$,
there exists some $x \in P^i$ with $r(x) < y^*_i$. According to (1) this means that
x is not covered by $y^*_k$ if $k \ge i$. Since $x \in P^i$ yields that $x \notin \mathrm{cover}\{y^*_k : k < i\}$,
it is also not covered by $y^*_k$ if $k < i$, a contradiction.

Claim 2: $\mathrm{cover}(y^*_i) \cap P^i \subseteq \mathrm{cover}(y_i)$.

Let $x \in \mathrm{cover}(y^*_i) \cap P^i$. Since $x \in \mathrm{cover}(y^*_i)$ we get $l(x) \le y^*_i$ and, using
Claim 1, $l(x) \le y_i$. On the other hand, $x \in P^i$ and the definition of
$y_i = L(P^i)$ yield $y_i \le r(x)$. Together, $l(x) \le y_i \le r(x)$, such that
$y_i \in T(x)$, meaning that $x \in \mathrm{cover}(y_i)$ (see (1)).

The induction hypothesis together with Claim 2 shows the result.

Finally, since $Y^*$ is a finite solution we get that $P \subseteq \mathrm{cover}\{y^*_k : k \le S\}$ and
hence $P \subseteq \mathrm{cover}\{y_i : i \le S'\}$, where $S' = \min\{|\mathrm{Opt}|, S\} \le S$. This means
that at the end of iteration $S'$ the set

$$P = P^{S'+1} = P^{S'} \setminus \mathrm{cover}\{y_{S'}\} = P \setminus \mathrm{cover}\{y_1, \ldots, y_{S'}\} = \emptyset$$

and the algorithm terminates with a finite solution, $|\mathrm{Opt}| = S' \le S$. □








The algorithm can be transferred to arbitrary graphs if we know in advance
that no point $x \in P$ can be covered by two stations belonging to different
edges of G. This is formalized below.

Definition 3 For $e_1 \neq e_2 \in E$ let $C(e_1, e_2) := P \cap \mathrm{cover}(e_1) \cap \mathrm{cover}(e_2)$ be the
conflict zone of edge $e_1$ and edge $e_2$.

If the conflict zones for all pairs of edges are empty, then (CC) for the whole
graph G decomposes into $|E|$ independent subproblems, one for each edge,
and can be solved by our algorithm. In the next section we justify that the
total population living in these conflict zones is rather small. A weaker
definition of conflict zones which ensures that the algorithm can be applied on
trees, even if (CC) is not decomposable, is currently under research.
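Whether the decomposition applies can be checked, for a finite sample of points of P and the Euclidean distance, with a sketch such as the following. The sampling of P by points and the function names are assumptions for illustration; edges are given as pairs of endpoints.

```python
import math


def point_segment_distance(x, a, b):
    """Euclidean distance from point x to the segment with ends a, b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    denom = dx * dx + dy * dy
    if denom == 0:
        t = 0.0
    else:
        t = ((x[0] - a[0]) * dx + (x[1] - a[1]) * dy) / denom
    t = max(0.0, min(1.0, t))              # clamp to the segment
    cx, cy = a[0] + t * dx, a[1] + t * dy  # closest point on the segment
    return math.hypot(x[0] - cx, x[1] - cy)


def conflict_points(points, edges, radius):
    """Points that lie within the radius of at least two edges, i.e.
    points belonging to some conflict zone."""
    result = []
    for x in points:
        near = sum(point_segment_distance(x, a, b) <= radius
                   for a, b in edges)
        if near >= 2:
            result.append(x)
    return result
```

If the returned list is empty for the sampled points, each of them can be assigned to a single edge and the single-edge procedure can be run edge by edge.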

3 Application to the German Railway Network 

To study the practical relevance of the algorithm we use data from Germany. 
The towns and cities are given as a set of 30494 population areas of which 
we know the outlines (given as polygons) and the number of inhabitants. 
The polygons are not identical with the borders of communities and also do 
not form a partition of Germany. They represent the population distribution 
better than community borders since open space is excluded. We assume 
uniform population distribution inside each polygon. 

The network of the German railway DB currently has 6828 stops. For the
Euclidean distance with r = 2 km their coverage level is 52.4%. However,
65.2% of the population live within 2 km of the network (nodes
and edges), i.e., the coverage level could be increased by almost 13 percentage points.

We define P to be the populated area within (2 - ε) km Euclidean distance
from the network (to guarantee a finite solution, cf. Figure 1) that is not
already covered by existing stops. Finding a minimal number of new stops
to cover P with our algorithm requires that we can apply it to every edge
separately. As explained above, this would be possible if all conflict zones
were empty. For the data of Germany this requirement is clearly not satisfied
(Figure 2).

This does not mean that the algorithm is useless for practical purposes. If
we exclude the population in the conflict zones from the covering problem,
we can still apply the algorithm. To find out how much this would affect the
problem, we computed all conflict zones. We found that their union contains
1.5% of the total population. Consequently, for more than 80% of the possible
increase in the coverage level a minimal set of new stops can be calculated
with our algorithm.
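One way to read these percentages is the following back-of-the-envelope calculation, which assumes, as a worst case, that the entire conflict-zone population (1.5% of the total) lies in the not yet covered area near the network:

```latex
\begin{align*}
\text{possible increase} &= 65.2\% - 52.4\% = 12.8 \text{ percentage points},\\
\text{share unaffected by conflict zones} &\ge \frac{12.8 - 1.5}{12.8} \approx 0.88 > 0.80.
\end{align*}
```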

If the conflict zones are not excluded from the covering problem, our algo- 
rithm, applied to every edge, can be used as a heuristic to solve problem 
(CC). We expect it to produce near-optimal solutions. This will be studied 
in further research. 








Fig. 2. Two edges with nonempty conflict zone at the city of Aachen. Shown are
the union of circles with radius 2 km around stations (dotted), the area covered by
the edges (dashed) and the small area containing the conflict zone (bold).



References 

1. M. J. Demetsky, M. Asce, and B. B.-M. Lin (1982), “Bus stop location and
design”, Transportation Engineering Journal 108, 313-327.

2. H.W. Hamacher, A. Liebers, A. Schöbel, D. Wagner, and F. Wagner (2001),
“Locating new stops in a railway network”, Electronic Notes in Theoretical
Computer Science 50:1.

3. E. Kranakis, P. Penna, K. Schlude, D.S. Taylor, and P. Widmayer (2002),
“Improving Customer Proximity to Railway Stations”, Technical report, ETH
Zurich.

4. G. Laporte, J.A. Mesa, and F.A. Ortega (2002), “Locating stations on rapid
transit lines”, Computers and Operations Research 29, 741-759.

5. A. Murray, R. Davis, R.J. Stimson, and L. Ferreira (1998), “Public Transportation
Access”, Transportation Research D, 3:5, 319-328.

6. A. Murray (2001), “Strategic analysis of public transport coverage”,
Socio-Economic Planning Sciences 35, 175-188.

7. A. Murray (2002), “A Coverage Models for Improving Public Transit System
Accessibility and Expanding Access”, Technical Report, Ohio State University.

8. A. Schöbel, H.W. Hamacher, A. Liebers, and D. Wagner (2002), “The
continuous stop location problem in public transportation”, Technical Report
Wirtschaftsmathematik 81, University of Kaiserslautern.





Innovative Solutions in Bimodal Transport:
Road / Inland Waterway



Joachim R. Daduna; Johannes Schröter

FHW Berlin; Badensche Straße 50 - 51, D - 10825 Berlin
FW Grundt Logistik & Spedition; Straße 22 Nr. 2 - 10, D - 13509 Berlin



1 Freight Transport Development and Combined Transport

In freight transport, the forecasts for the coming years show a marked increase,
with the growth relating essentially to road freight transport. The causes are
manifold, among them structural changes in the volume of goods and rising
transport demand due to the increasing internationalization of manufacturing
and trade (see, e.g., Daduna 2001). Since growth in road freight transport cannot
be reconciled in any way with transport policy objectives, among other things for
ecological reasons but also because of the largely missing possibilities for capacity
expansion in the road network, ways out of this situation are being sought.
Regulatory measures by state institutions to reduce (road) freight traffic or to
influence transport mode decisions generally prove to be counterproductive in
terms of economic policy, also in view of the importance of mobility as one of the
essential determinants of sustainable economic growth. Suitable solutions must
therefore be found with other approaches, in which, besides information and
communication (I&C) technologies (see, e.g., Daduna / Voß 2000 and Daduna
2001), the formation of bimodal or multimodal transport chains in combined
freight transport (Kombinierter Ladungsverkehr, KLV) (see, e.g., Seidelmann 1997)
is at the forefront.

I&C technologies provide a basis for using transport infrastructure more
efficiently, i.e., they primarily improve operational processes, e.g. within an
existing (road) network, without, however, achieving quantitative shift effects and
thus a change of the modal split in freight transport. The situation is different for
bimodal and multimodal transport chains, which are explicitly aimed at a modal
shift. The (classical) bimodal solution in KLV is the rail / road combination,
which, however, has so far not been realized to the desired extent (see, e.g.,
von Stackelberg 1998). The reasons are the costs and time requirements associated
with the transshipment operations when changing transport mode, as well as the
conceptual and organizational deficits among the providers of rail (freight)
transport services. The combinations inland waterway / road and inland waterway /
rail have received less attention so far, among other things because of restrictions
that arise in inland navigation, including at the interfaces between transport
modes. For this form of KLV, an innovative (and patented) concept is presented
below that offers a solution to the interface problem which can also be regarded
as efficient from a cost perspective.



2 System Concept

The system concept is based on Ro/Ro inland motor vessels and pushed barges
of various designs that can take road and / or rail vehicles on board. From a
logistics point of view, the decisive advantage of Ro/Ro technology is that fixed
infrastructure facilities are needed only to a small extent (e.g. in the form of
ramps) and no (investment-intensive) crane installations are required. If complete
vehicle units are loaded (e.g. semitrailers with their tractor units), no additional
Ro/Ro tractor units are necessary. If, however, only semitrailers (or railway
wagons) are to be transported, loading and unloading must be carried out with the
help of tractor units (carried on board). The processes underlying such a system
concept can be represented in a uniform structure only to a limited extent. They
are therefore described in Section 3 on the basis of two fundamental approaches.



2.1 Vehicle Concept

The vehicle concept is based on inland motor vessels and pushed barges. In
terms of size, a convoy comprises a large motor vessel (Großmotorschiff, GMS)
measuring 110 × 11.4 m and one pushed barge measuring 76.5 × 11.4 m, or two
of these pushed barges and a push unit. This specification assumes the standard
vessels that can be used in large parts of the inland waterway network. Depending
on the specific conditions of individual network sections, these dimensions may
also be smaller or larger.

The essential technical addition to the vessels is the installation of hydraulically
operated support rams in the bow area (see Fig. 1), which ensure stability during
loading and unloading at a ramp. They also prevent the loading opening from
sinking below the waterline, which can occur as the load weight increases. This
technology can be supplemented by carriages attached to the hydraulic rams,
which permit a limited drive-on at the ramps. The self-propelled carriages
developed for the Automatic Loading System (ALS) in KLV can serve as the
technical basis. The cargo hold is equipped with three parallel tracks, which are
intended to accommodate the railway wagons but also serve as lane guidance for
articulated lorries and semitrailers, which, because of their smaller width, can be
placed in four rows.

Fig. 1: Pushed barge, side section and plan view (detail)



2.2 Infrastructure Requirements

The infrastructure is limited to ramps for carrying out loading and unloading
operations. These exist in a number of ports, without, however, providing a
quantitatively sufficient basis for the system concept. What is decisive is the
(largely) area-wide availability of ramps through the substitute crossing points
(Ersatzübergangsstellen) set up for military reasons throughout the inland
waterway system of the Federal Republic of Germany, for example along the Elbe
with a total of 28 crossing points. Since not all of these ramps are directly
connected to the road or rail network, corresponding infrastructure measures may
be required for the sites to be used. If loading and unloading take place in port
facilities, existing transshipment equipment can also be used, provided it meets
the technical and capacity requirements.

Since the ramps are laid out in the direction of the current, motor vessels and
pushed barges travelling downstream must turn around for loading and unloading,
because loading takes place via the opened bow door. It can be assumed that
suitable turning points are available near a ramp in use where the width of the
waterway does not permit the turning manoeuvre in the area of the ramp itself.
Only in extreme cases does this result in restrictions on the length of the vessel
units, which may negatively affect economic viability.









3 Applications

The system concept presented here can be applied in both short-haul and long-haul freight transport. The underlying objectives are varied, with possible cost reductions in the foreground. Under current conditions these can be achieved primarily in personnel and fuel costs. In future, however, heavy goods traffic will see an additional saving from avoiding the distance-based truck toll on motorways ("streckenbezogene LKW-Gebühr auf Autobahnen"). There are further advantages, but these cannot be quantified directly and can enter the calculations only as an assessed benefit. Modal shift effects play no significant role in these considerations of economic viability, since they are transport-policy and ecological aspects that belong to the macro-logistics level and do not enter into an (operational) cost analysis.

Implementing bimodal concepts of this kind requires a suitable inland waterway network. In Germany this exists in many regions, so the concept can be put into practice there. Restrictions on the use of the inland waterway system (ice drift, water depth) have to be taken into account, but they are no fundamental obstacle to implementation. If the infrastructure is temporarily unavailable there are fallback options, since operations can then revert to monomodal processes, i.e. exclusive use of road or rail freight transport.



3.1 Application Example in Short-Haul Freight Transport

In short-haul freight transport (local and regional), one application is the supply of inner-city areas from freight facilities on the periphery (see Daduna 2000), e.g. a freight village (Güterverkehrszentrum, GVZ). In conventional structures, delivery vehicles normally leave the GVZ in the early morning, when commuter traffic is also flowing into the city; the resulting congestion frequently causes temporary hold-ups that can disrupt operations. These can be avoided with a bimodal approach in which the delivery vehicles are loaded in the evening or at night and then carried by inland vessel to an inner-city handover point. There the drivers or vehicle crews take over the vehicles and begin deliveries immediately. The vehicles return in the same way. Efficient operations require a suitable structuring of the delivery zones. Where customer locations vary (for example daily), binary or single-source warehouse location models based on (aggregated) demand points can be used (see e.g. Klose 2001, pp. 185ff); where customer locations are known (deterministic), location-routing models can be applied (see e.g. Klose 2001, pp. 293ff).







The advantages lie in the cost savings from bundling vehicles and in avoiding the forthcoming truck tolls. Service quality (e.g. punctuality of deliveries) can also be improved, since critical road sections are bypassed and the customer locations are closer to the point where the vehicles start their rounds.



3.2 Application Example in Long-Haul Freight Transport

In long-haul freight transport the concept can be used, among other things, to distribute containers arriving from overseas efficiently when their destinations are widely dispersed. Starting from a seaport container terminal, the containers travel on semi-trailers, first bundled over the inland waterway network; fine distribution is then done by road freight, starting at the landing points. Since operational and economic considerations rule out using every available ramp, location planning is required. This can be reduced to hub location problems (see e.g. Campbell et al. 2002), where the candidate locations usually lie on a line. Collection processes can be designed in the same way, e.g. for feeding seaport container terminals. Pick-up and delivery services can also be set up; because of the complexity of these operations, efficient information management and suitable planning tools are then essential.

Because of the low transport speed, this approach is suitable only for operations that are not time-critical. The main potential therefore lies in bulk freight (including waste management) and in transports that can be planned sufficiently far ahead. The economic advantages are savings in personnel and fuel costs and the avoidance of road user charges. A further advantage is the higher permissible gross vehicle weight for truck transports in combined transport (KLV), which allows capacity increases of about 15%. The concept can also be used for vehicle returns that are not time-critical, e.g. as an economical solution for providing empty wagons in rail freight transport.



4 Summary and Outlook

The two application examples show that a system concept of this kind is a useful addition to (bi- or multimodal) combined transport (KLV). Its limits must also be recognised, however: they arise from the network structure and its available capacity and from the requirements of shippers and customers. Point-to-point connections are less problematic, since the routes concerned need only meet the necessary (technical) prerequisites, which can be checked explicitly. In area-oriented problems not all locations can be integrated, only those within a corridor along the inland waterways to be used. Nor can time-critical transports be handled in this way, since this mode is slower than road freight transport. Network availability may be restricted by weather (ice drift, high and low water), but this is not a fundamental problem for the system. When such restrictions occur, an effective fallback exists, since the truck is ultimately a self-contained monomodal system that can operate independently.

A detailed cost estimate for such concepts can only be made case by case, depending among other things on the vessel design and on any investment needed to upgrade the intended landing points. The calculations also still carry an element of uncertainty: under current plans, entry into this new technology is to receive targeted support from the Federal Ministry of Transport, Building and Housing, through the subsidies for investments and operating costs provided for in the guidelines for promoting new combined transport on rail and waterway (Richtlinien zur Förderung neuer Kombinierter Verkehre auf Schiene und Wasserstraße). The overall tendency points to positive results, even without including monetised effects on the modal split in freight transport that would result from realised modal shift.



References:

Campbell, J.F. / Ernst, A. / Krishnamoorthy, M. (2002): Hub Location Problems. in:
Drezner, Z. / Hamacher, H.W. (eds.): Facility Location - Applications and Theory.
Berlin et al., 373-407

Daduna, J.R. (2000): Logistische Strukturen und Prozesse. in: Ammann, P. / Daduna, J.R. /
Schmid, G. / Winkelmann, P.: Distributions- und Verkaufspolitik. Köln, 281-310

Daduna, J.R. (2001): Planung und Steuerung im Straßengütertransport unter dem Einfluß
von Informationstechnologien. in: Sebastian, H.-J. / Grünert, T. (eds.):
Logistik-Management. Stuttgart et al., 257-269

Daduna, J.R. / Voß, S. (2000): Informationsmanagement im Verkehr. in: Daduna, J.R. /
Voß, S. (eds.): Informationsmanagement im Verkehr. Heidelberg, 1-21

Klose, A. (2001): Standortplanung in distributiven Systemen. Heidelberg

Seidelmann, C. (1997): Der Kombinierte Verkehr - Ein Überblick. in: Internationales
Verkehrswesen 49, 321-324

Stackelberg, F. von (1998): Die Binnenschiffahrt im Kombinierten Ladungsverkehr. in:
Hartwig, K.-H. (ed.): Kombinierter Verkehr. Göttingen, 93-122





Optimal Routing of Snowplows - A Column
Generation Approach 



Nima Golbaharan, Per Olov Lindberg, and Maud Göthe Lundgren

Linköpings Universitet, Department of Mathematics, SE-581 83 Linköping,
Sweden, e-mails: nigol, polin, mabre@mai.liu.se



Abstract. In countries with heavy winters, winter road maintenance (snow
removal, salting, etc.) is an important problem. In Sweden the government and
municipalities together spend close to 0.3 GEUR every year on winter road
maintenance. Approximately half of this is the cost of snow removal, which is
mainly the cost of snowplows, which in turn depends mainly on their routing. In
this paper we study optimal routing of snowplows after snowfall. One then has to
design a set of routes starting and ending at given depots, such that each road
segment is plowed within a prescribed time window that depends on the class of
the road segment. Our solution approach is based on Dantzig-Wolfe decomposition
with column generation. To obtain an integer solution to the master problem we
have applied two heuristic procedures, one using branch-and-bound on a subset of
the columns and the other a greedy procedure.



1 Introduction 

In this paper we consider a special case of the problem of identifying routes 
with associated schedules for snowplows in order to minimize the total cost 
of the snow removal operation. The problem can roughly be decomposed into 
two classes; routing (and scheduling) during snowfall and routing after snow- 
fall. We are interested in modeling the problem for the after snowfall case 
and finding a solution to this problem that fulfills the constraints defined 
by the Road Administration Authority. The solution approach is based on 
Dantzig-Wolfe decomposition in which the variables are generated by a column
generation scheme. This yields a fractional solution with an objective
value that is a lower bound on the total routing cost.

Despite the importance of snow removal, gritting and salt spreading, related
work on the subject has been limited. In 1984 Lemieux and Campagna [1]
solved a single-depot snow removal problem, stated as plowing a set of
streets on both sides, once in each direction, while respecting the priorities of
the road segments. Cook and Alprin [2] develop a dynamic routing heuristic,
the Closest Street Heuristic, for snow and ice removal in an urban environment.
Eglese [3] presents a solution method for a routing problem for winter
gritting in a local authority area. Campbell and Langevin [4] describe
urban snow removal and disposal operations in detail. They conclude:
"Because of the inherent complexity of the problems, an interactive
decision support system would be most useful to allow solutions to be adjusted
to incorporate political and difficult to quantify factors. Hence, fast heuristic
algorithms that produce good approximate solutions will be preferred over
optimal algorithms that require excessive computer time or power."

2 Problem Description and Mathematical Formulation 

2.1 The Snow Removal Routing Problem After Snowfall 

The snow removal routing problem after snowfall, SRRPAS, is the mathe- 
matical problem of routing a set of snowplows after snowfall, so that every 
road segment is plowed exactly once and the total cost of plowing is mini- 
mized. In practice, there are additional restrictions on the solution that have 
to be considered. Usually, a time window is associated with each road segment
in the road network, which gives rise to a more restricted version of SRRPAS
referred to as the snow removal routing problem with time windows
after snowfall, SRRPASTW. That is, the problem consists of designing a set
of routes for snowplows at minimal cost such that all road segments
that are to be plowed are covered within specified time limits.

2.2 Mathematical Formulation of the SRRPASTW 

We formulate SRRPASTW from an operator's point of view, which means
that the goal is to minimize the overall operational cost. Let $G = (\mathcal{N}, \mathcal{E})$ be
a network with node set $\mathcal{N}$ and arc set $\mathcal{E}$. Each edge $e = (i,j) \in \mathcal{E}$ has an
associated set of data indicating plowing cost, plowing time, transportation
cost and transportation time. A route is a path in the network, starting and
ending at a depot, together with an indication of whether the edges in the
path are plowed or just used for transport. Let $\mathcal{D}$ be the set of depots in
the network, $\mathcal{R}_d$ the set of all possible routes originating at depot $d \in \mathcal{D}$,
$\mathcal{R} = \cup_{d \in \mathcal{D}} \mathcal{R}_d$ and $\mathcal{E}_p \subseteq \mathcal{E}$ the set of road segments that need to be plowed. Let
$v_d$ be the number of snowplows available at depot $d$, $w_d$ a charge for using
one extra snowplow in addition to those available at depot $d$, and $c_r$ the
operating cost of route $r \in \mathcal{R}$. Further, let $a_{er}$ be equal to 1 if edge $e \in \mathcal{E}_p$ is
plowed in route $r$, 0 otherwise. The problem is then formulated as follows:

$$[P] \quad \min_{x,y} \; \sum_{r \in \mathcal{R}} c_r x_r + \sum_{d \in \mathcal{D}} w_d y_d \tag{2.1a}$$

s.t.

$$\sum_{r \in \mathcal{R}} a_{er} x_r \ge 1 \quad \forall e \in \mathcal{E}_p, \tag{2.1b}$$

$$\sum_{r \in \mathcal{R}_d} x_r \le v_d + y_d \quad \forall d \in \mathcal{D}, \tag{2.1c}$$

$$x_r \in \{0,1\} \quad \forall r \in \mathcal{R}, \tag{2.1d}$$

$$y_d \ge 0, \text{ integer}, \quad \forall d \in \mathcal{D}. \tag{2.1e}$$







Here, $x_r$ indicates whether route $r$ is in the solution or not, whereas $y_d$ is
the number of extra snowplows used at depot $d$. The first term of (2.1a) is
the total cost of routes in the solution and the second one is the charge for
extra snowplows. Constraint (2.1b) says that every edge in $\mathcal{E}_p$ must be plowed
at least once. Allowing edges to be overcovered is valid: if two routes
plow the same edge, one of them may be modified to merely transport on that edge,
giving a solution with a lower cost. Constraint (2.1c) is the depot fleet size
constraint. The problem [P] is an extension of the set covering problem, which
is a well-known NP-hard problem (Papadimitriou and Steiglitz [5]).
Thus the SRRPASTW is a hard problem.
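To make the structure of [P] concrete, the following minimal Python sketch solves a toy instance by complete enumeration. It is only an illustration under stated assumptions: the routes, costs, depot names and the function name `solve_tiny_master` are all hypothetical, and enumeration is of course not the solution method of the paper.

```python
from itertools import product

def solve_tiny_master(routes, plow_edges, available, extra_charge):
    """Brute-force the covering model [P] on a toy instance.

    routes       : list of (depot, cost c_r, set of edges plowed by the route)
    plow_edges   : set of edges that must be plowed (E_p)
    available    : dict depot -> number of available snowplows (v_d)
    extra_charge : dict depot -> charge per extra snowplow (w_d)
    """
    best_cost, best_choice = float("inf"), None
    for choice in product((0, 1), repeat=len(routes)):
        covered, cost = set(), 0.0
        used = {d: 0 for d in available}
        for x_r, (depot, c_r, edges) in zip(choice, routes):
            if x_r:
                covered |= edges
                cost += c_r
                used[depot] += 1
        if not plow_edges <= covered:
            continue  # covering constraint violated: some edge is unplowed
        # y_d = routes used beyond the fleet at depot d, charged w_d each
        cost += sum(extra_charge[d] * max(0, used[d] - available[d])
                    for d in available)
        if cost < best_cost:
            best_cost, best_choice = cost, choice
    return best_cost, best_choice

# Hypothetical toy data: four candidate routes, three edges, two depots.
routes = [("A", 10.0, {1, 2}), ("A", 7.0, {3}),
          ("B", 12.0, {2, 3}), ("A", 30.0, {1, 2, 3})]
best_cost, best_choice = solve_tiny_master(
    routes, {1, 2, 3}, {"A": 1, "B": 1}, {"A": 8.0, "B": 8.0})
```

In this toy instance edge 2 ends up covered twice, which is exactly the overcovering that the "at least once" form of the covering constraint permits.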

3 Application of Dantzig-Wolfe Decomposition to
SRRPASTW

3.1 The Master Problem 

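Relaxing the integrality requirements we obtain $0 \le y_d \in \mathbb{R}$ and $0 \le x_r \le 1$,
which can be further relaxed to $x_r \ge 0$. To simplify the discussion of the
solution algorithm, we will proceed by ignoring the variables $y_d$, since these
do not affect the covering constraints. Lagrangian relaxation of (2.1b) and
(2.1c) by multipliers $\lambda_e \ge 0$ and $\mu_d \ge 0$ gives the relaxed problem

$$[P^{LR}] \quad \min_{x \ge 0} \; \sum_{r \in \mathcal{R}} c_r x_r + \sum_{e \in \mathcal{E}_p} \lambda_e \Big(1 - \sum_{r \in \mathcal{R}} a_{er} x_r\Big) + \sum_{d \in \mathcal{D}} \mu_d \Big(\sum_{r \in \mathcal{R}_d} x_r - v_d\Big)$$

The coefficient $c_r$ consists of two parts, $c_r^p$ and $c_r^t$, the total cost of plowing
and the total transport cost of route $r$, respectively. The reduced cost of $x_r$,
$r \in \mathcal{R}_d$, is then $\bar{c}_r(\lambda) = c_r^p + c_r^t - \sum_{e \in \mathcal{E}_p} \lambda_e a_{er} + \mu_d$. The term
$c_r^p - \sum_{e \in \mathcal{E}_p} \lambda_e a_{er} = \sum_{e \in \mathcal{E}_p} (c_e^p - \lambda_e) a_{er}$ implicitly defines the reduced plowing cost for an edge $e$,
where $c_e^p$ is the plowing cost of an edge $e$.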
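As a small worked illustration of the reduced cost, the Python sketch below prices a single route from its plowed and transported edges. It assumes the multipliers on the covering constraints are given per edge (`lam`) and that the depot constraint contributes one constant per depot (called `mu_d` here, a name chosen for this sketch); all numbers are hypothetical.

```python
def reduced_cost(plowed, transported, plow_cost, transport_cost, lam, mu_d):
    """Reduced cost of one route for given multipliers.

    plowed, transported : lists of edges plowed / merely traversed by the route
    plow_cost           : dict edge -> plowing cost
    transport_cost      : dict edge -> transport cost
    lam                 : dict edge -> multiplier of the covering constraint
    mu_d                : multiplier of the fleet constraint of the route's depot
    """
    # Each plowed edge contributes its reduced plowing cost (cost minus dual).
    plow_part = sum(plow_cost[e] - lam.get(e, 0.0) for e in plowed)
    # Transported edges contribute their plain transport cost.
    transport_part = sum(transport_cost[e] for e in transported)
    return plow_part + transport_part + mu_d

# Hypothetical data: plow edges a and b, drive over edge c.
rc = reduced_cost(
    plowed=["a", "b"], transported=["c"],
    plow_cost={"a": 4.0, "b": 6.0, "c": 5.0},
    transport_cost={"a": 1.0, "b": 2.0, "c": 1.5},
    lam={"a": 5.0, "b": 3.0}, mu_d=0.5)
```

A route is worth adding to the master problem only if this value is negative; here edge a alone has a negative reduced plowing cost, but the route as a whole does not price out.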



3.2 The Subproblems 

The problem of identifying the column (route) with the most negative reduced 
cost, can be seen as a shortest path problem with time windows, SPPTW. To 
solve a subproblem, one for each depot, we consider a so-called label-setting 
algorithm. 

Labeling: Consider an arbitrary road segment that is represented by an edge
$e = (i,j)$. Assume that plowing the road segment takes $t_e^p$ time units at the
cost of $c_e^p$. Transporting on $e$ causes the cost $c_e^t$ and the transportation time
$t_e^t$, with $t_e^p > t_e^t$ and $c_e^p > c_e^t$. Further, there is a time window $[0, b_e]$ specified
for edge $e$. Every label at a node $i \in \mathcal{N}$ corresponds to a path from the given
depot to $i$ and is a triplet $(c, t, p)$, where $c$ is the cost of the corresponding
path, $t$ is the time spent along the path and $p$ points at the label that precedes
the label in the path. Moving from node $i$, with label $L = (c_i, t_i, p_i)$,
gives rise to two different labels, defined below, at each neighbor $j$ accessible
from node $i$.

$$L^{Plow}: \; \begin{cases} c' = c_i + c_e^p - \lambda_e \\ t' = t_i + t_e^p \\ p' = L \end{cases} \qquad L^{Transport}: \; \begin{cases} c'' = c_i + c_e^t \\ t'' = t_i + t_e^t \\ p'' = L \end{cases}$$

Creating labels in this manner, we conclude that the total number of labels 
grows rapidly as the number of treated nodes increases, since a label at a 
node generates two labels at each of its neighbors. 
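The label extension step can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the `Label` and `Edge` types and their field names are hypothetical, and the check that plowing must finish within the edge's time window is an assumption about how the window is enforced.

```python
from typing import NamedTuple, Optional

class Label(NamedTuple):
    cost: float             # reduced cost of the partial path
    time: float             # time spent along the partial path
    pred: Optional["Label"] # preceding label on the path
    node: str               # node the label is attached to

class Edge(NamedTuple):
    head: str               # neighbor j reached over the edge
    plow_cost: float
    plow_time: float
    transport_cost: float
    transport_time: float
    deadline: float         # upper end of the time window [0, b_e]
    dual: float             # multiplier lambda_e of the covering constraint

def extend(label, edge):
    """Create the plow and transport labels at edge.head from one label."""
    new_labels = []
    t_plow = label.time + edge.plow_time
    if t_plow <= edge.deadline:  # assumption: plowing must end inside the window
        new_labels.append(("plow", Label(
            label.cost + edge.plow_cost - edge.dual, t_plow, label, edge.head)))
    new_labels.append(("transport", Label(
        label.cost + edge.transport_cost,
        label.time + edge.transport_time, label, edge.head)))
    return new_labels

# Hypothetical edge whose dual makes plowing attractive.
start = Label(0.0, 0.0, None, "depot")
edge = Edge("j", 5.0, 2.0, 1.0, 0.5, 4.0, 7.0)
children = extend(start, edge)
```

Because every label spawns up to two children per neighbor, the label count can double at each step, which is why the dominance and discretization measures discussed next matter.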

Cycle Avoidance: If $\bar{c}_e = c_e^p - \lambda_e < 0$, then plowing the edge $e$ is
advantageous. During the course of the solution process we may thus create a
negative cycle by plowing all or some of the edges in a cycle. We have applied
a simple procedure with constant time complexity to forbid negative cycles.
We define a matrix $M$, where the columns represent edges and the rows
represent labels. An entry of $M$ with value 1 indicates that the corresponding
edge has been plowed earlier in the route corresponding to the label. Before
starting to plow an edge $e$ using label $\ell$, the question of whether the edge is
already plowed in the path is answered by checking the $(\ell, e)$ entry of
the matrix. This is done in $O(1)$ time.
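The label-by-edge matrix can be sketched in Python as below. The class and method names are hypothetical; the sketch stores one row per label and copies the parent's row when a label is created, so the lookup is constant time while row creation costs time proportional to the number of edges.

```python
class PlowedMatrix:
    """Rows are labels, columns are edges; entry 1 means 'already plowed'."""

    def __init__(self, n_edges):
        self.n_edges = n_edges
        self.rows = []

    def new_label(self, parent=None, plowed_edge=None):
        """Add a row for a new label, inheriting the parent's plowed edges."""
        if parent is None:
            row = bytearray(self.n_edges)       # all zeros: nothing plowed yet
        else:
            row = bytearray(self.rows[parent])  # copy of the parent's row
        if plowed_edge is not None:
            row[plowed_edge] = 1
        self.rows.append(row)
        return len(self.rows) - 1               # index of the new label

    def already_plowed(self, label, edge):
        """Constant-time check of the (label, edge) entry."""
        return self.rows[label][edge] == 1

# Hypothetical usage with three edges: root -> plow edge 1 -> plow edge 2.
m = PlowedMatrix(3)
root = m.new_label()
a = m.new_label(parent=root, plowed_edge=1)
b = m.new_label(parent=a, plowed_edge=2)
```

A plow extension over edge e from label l would then be generated only if `m.already_plowed(l, e)` is false, which prevents collecting the same negative reduced plowing cost twice.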

Time Discretization: Since the number of labels increases exponentially, 
problems involving wide time windows often require some special treatment. 
To simplify the problem, all time values in the problem have been discretized. 
By doing so, we reduce the number of labels during the column generation 
procedure. The discretization is done by multiplying all time values by some 
scalar and rounding up to the nearest integer. The value of the chosen scalar 
determines the degree of discretization. 
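The discretization rule described above amounts to a single line of Python. The function name and the sample values are hypothetical.

```python
import math

def discretize(times, scale):
    """Multiply every time value by `scale` and round up to an integer."""
    return [math.ceil(t * scale) for t in times]

# Hypothetical travel times in hours: a smaller scale gives a coarser grid,
# so more labels end up with identical time values.
coarse = discretize([0.2, 1.0, 2.6], 0.5)
fine = discretize([0.2, 1.0, 2.6], 2)
```

Rounding up rather than down keeps the discretized problem conservative: a route that meets its time windows on the integer grid also meets them in the original time values.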

To avoid unnecessary labels, some label dominance criteria are utilized. 
Eliminating dominated labels may affect performance of the application dra- 
matically. A label is dominated and can be deleted from further consideration 
whenever any other label at the same node has better time and cost. 
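A dominance filter of this kind can be sketched in Python as follows. Labels are reduced to (cost, time) pairs for the illustration, and the sketch assumes weak dominance (no worse in both criteria, strictly better in at least one); whether the set of already plowed edges should also enter the comparison is not stated in the text and is left out here.

```python
def dominates(a, b):
    """True if label a = (cost, time) is no worse than b in both and differs."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b

def insert_label(labels, new):
    """Insert `new` into the label list of one node, keeping only
    non-dominated labels. Returns the updated list."""
    if any(old == new or dominates(old, new) for old in labels):
        return labels                      # new label is redundant
    # Drop every stored label that the new one dominates.
    return [old for old in labels if not dominates(new, old)] + [new]

# Hypothetical sequence of labels arriving at the same node.
labels = []
for candidate in [(5, 3), (4, 4), (4, 2), (6, 6)]:
    labels = insert_label(labels, candidate)
```

In the example, (5, 3) and (4, 4) are incomparable and both survive until (4, 2) arrives and removes them; (6, 6) is rejected on arrival.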

3.3 Finding an Integer Solution 

Branch-and-Bound: One way of finding an integer solution is to use a 
branch-and-bound procedure. However, since branch-and-bound has expo- 
nential worst-case performance, it may take unreasonably long time to find 
and verify an optimal integer solution. We have applied a variable reduction 
technique, as described below, to speed up the branch-and-bound process. 

Variable Reduction: Let $T \in (0, 1)$ be a chosen reduction threshold. Given
a master problem with a large pool of columns, $U$, we select a variable $x_r$
with current value $x_r < T$, set $x_r = 0$ and re-solve the master problem. This
procedure is repeated until there is no variable $k \in U$ with $x_k < T$. Experience
and knowledge about the fractional solution obtained from the column
generation procedure can be used to choose a proper value for $T$. After the
variable reduction procedure, we solve the remaining integer problem with
branch-and-bound.
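The variable reduction loop can be sketched in Python as below. Several points are assumptions of this sketch rather than statements of the paper: the LP solver is passed in as a callable `solve_master` (hypothetical), fixing a variable to zero is implemented by removing its column from the pool, the column with the smallest value is removed first, and feasibility of the reduced master problem is not checked.

```python
def reduce_variables(columns, solve_master, threshold):
    """Repeatedly drop a column whose LP value is below `threshold`.

    columns      : iterable of column identifiers
    solve_master : callable(set of columns) -> dict column -> LP value
    threshold    : reduction threshold T in (0, 1)
    Returns the remaining columns and their LP values.
    """
    active = set(columns)
    while True:
        x = solve_master(active)                 # re-solve the master LP
        small = [r for r in active if x[r] < threshold]
        if not small:
            return active, x                     # every remaining x_r >= T
        # Assumption: remove the column with the smallest LP value first.
        active.discard(min(small, key=lambda r: x[r]))

# Stand-in for an LP solver, for illustration only: spreads one unit of
# weight evenly over the active columns.
def fake_solver(active):
    return {r: 1.0 / len(active) for r in active}

remaining, values = reduce_variables(
    ["r1", "r2", "r3", "r4", "r5"], fake_solver, 0.3)
```

With the stand-in solver, two columns are removed before all remaining values exceed the threshold; with a real LP solver the outcome depends on how the solution redistributes after each removal.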

A Greedy Approach: We have tried a greedy search procedure similar
to those used for set covering problems. We successively select columns
according to criteria such as $j^* = \arg\min_j \{c_j + w_d\}$. If $j^*$ is selected, the edges
plowed by $j^*$ are changed to transport edges in the unselected routes and
their costs are changed accordingly. $x_{j^*}$ is then set to 1 and the column $j^*$ is
removed from further consideration. This process is repeated until all edges
are covered.
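A greedy column selection in the spirit of set covering heuristics can be sketched in Python as follows. The selection rule used here, cost per newly covered edge, is the classic set covering ratio and is an assumption of this sketch, not necessarily the criterion used by the authors; the sketch also omits the cost update of unselected routes when their plowed edges become transport edges.

```python
def greedy_cover(columns, edges_to_plow):
    """Greedily pick columns until every edge is covered.

    columns       : dict name -> (cost, set of edges plowed by the route)
    edges_to_plow : set of edges that must be covered
    Returns the list of selected column names in selection order.
    """
    uncovered = set(edges_to_plow)
    remaining = dict(columns)
    chosen = []
    while uncovered:
        # Only columns that plow at least one still-uncovered edge are useful.
        useful = {j: (cost, edges & uncovered)
                  for j, (cost, edges) in remaining.items()
                  if edges & uncovered}
        if not useful:
            raise ValueError("remaining columns cannot cover all edges")
        # Assumed rule: smallest cost per newly covered edge.
        best = min(useful, key=lambda j: useful[j][0] / len(useful[j][1]))
        chosen.append(best)
        uncovered -= useful[best][1]
        del remaining[best]
    return chosen

# Hypothetical columns over four edges.
columns = {"r1": (10.0, {1, 2, 3}), "r2": (4.0, {1}),
           "r3": (6.0, {2, 3}), "r4": (9.0, {3, 4})}
selection = greedy_cover(columns, {1, 2, 3, 4})
```

In the example r4 is selected last although edge 3 is already covered; in the procedure described above, that edge would be switched to a transport edge in r4 and the route cost adjusted.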

Post Processing: To improve the integer solution obtained from the greedy
procedure we have developed a post processing procedure. Assume that $S =
T \cup P$ is the set of road segments between nodes $n_1$ and $n_2$ in a given route,
with $P$ being the plowed segments and $T$ those used for transport. Applying
the greedy procedure may render $P = \emptyset$, which means that a new transport
stretch between nodes $n_1$ and $n_2$ is created. The post processing replaces this
stretch by the shortest path between $n_1$ and $n_2$. These modifications actually
result in the generation of new routes, which yield an improved integer solution.
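The replacement of a pure transport stretch by a shortest path can be sketched in Python with Dijkstra's algorithm. The graph representation, the function names and the toy data are hypothetical, and the sketch uses transport cost as the only edge weight; it does not re-check time windows of the edges plowed later in the route.

```python
import heapq

def shortest_path(graph, source, target):
    """Dijkstra on graph: dict node -> list of (neighbor, transport cost).
    Returns (path, cost) or None if target is unreachable."""
    dist, pred = {source: 0.0}, {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue                       # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], pred[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(pred[path[-1]])
    return path[::-1], dist[target]

def shortcut(route, i, j, graph):
    """Replace the transport-only stretch route[i..j] by a shortest path."""
    result = shortest_path(graph, route[i], route[j])
    if result is None:
        return route
    path, _ = result
    return route[:i] + path + route[j + 1:]

# Hypothetical network: the stretch a-b-c-d costs 6, the direct arc a-d costs 3.
graph = {"a": [("b", 2.0), ("d", 3.0)], "b": [("c", 2.0)],
         "c": [("d", 2.0)], "d": []}
new_route = shortcut(["s", "a", "b", "c", "d", "t"], 1, 4, graph)
```

The shortened route is a new column for the master problem, which is why the post processing can improve on the best integer solution found among the generated columns.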



4 A Case Study and Computational Results 

In order to study the performance and the quality of the solutions generated 
by the decomposition scheme presented in Section 3, we have performed a 
number of tests on data representing the Operation District of Eskilstuna, 
ODE, which is located in the middle of Sweden. The network representing 
ODE has 814 edges and 363 nodes of which 7 are depots. According to op- 
erative standards set forth by the Swedish National Road Administration, 
road segments are divided into four standard classes with different time win- 
dows after snowfall when the road segment should be plowed. In ODE the 
time windows are $[0, k]$ with $k$ = 4, 6 or 8 hours. Tables 1 and 2 give some
computational results. In the tests, we first generated a varying number of 
columns, and then applied the greedy or the branch-and-bound heuristic. 



4.1 Computational Results 

The first column of both tables names the instance by its number of generated
columns (e.g. ODE2000 for 2000 columns), the second column gives the CPU time
(in seconds) for generating them, and the third column contains the
corresponding objective value Z_LP. The fourth and fifth columns
display the integer objective value and the additional CPU time needed to obtain
an integer solution. The greedy approach is executed on the complete set of
columns obtained from the column generation phase, whereas branch-and-bound
is applied to a very limited number of columns (given in column #Col) after the
variable reduction procedure. The column Z_post displays the improved
objective value obtained by the greedy procedure after post processing. Table
1 also shows that the greedy approach is both fast and gives good solutions,
in particular when post processing is used. A critical point for the
greedy approach is the choice of the rule for selecting the columns, and for the
branch-and-bound approach the right value of the threshold when reducing
the number of variables. The last column (#SP) in both Tables
1 and 2 contains the number of snowplows used in the solution.

Table 1. Results of greedy approach

Problem     CPU CG     Z_LP       Z_greedy   CPU INT   Z_post     #SP
ODE2000        718     51 920     68 237.0        32   59 811.5    24
ODE5000      4 141     48 788.5   66 233.8        79   57 615.8    23
ODE8000     10 161     47 648.7   71 290.9       136   61 652.9    25
ODE10000    17 046     47 157.9   64 442.4       156   55 578.4    22
ODE15000    35 936     46 192.9   70 244.5       253   60 652.9    24

Table 2. Results of branch-and-bound

Problem     CPU CG     Z_LP       Z_bb       CPU INT   #Col   #SP
ODE2000        718     51 920     58 581.8    14 714    143    25
ODE5000      4 141     48 788.5   74 103.5    72 268    259    24
ODE8000     10 161     47 648.7   57 312.7     6 262    277    24
ODE10000    17 046     47 157.9   56 846.0    12 032    287    23
ODE15000    35 936     46 192.9   56 833.0    28 605    310    24
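One way to read Tables 1 and 2 is through the relative gap between an integer objective value and the LP value of the same instance. The Python sketch below computes this gap from the tabulated numbers; the function and variable names are hypothetical, and the gap definition (Z_int - Z_LP) / Z_LP is a common convention assumed here, not one stated in the paper.

```python
# (Z_LP, Z_greedy, Z_post, Z_bb) per instance, copied from Tables 1 and 2.
results = {
    "ODE2000":  (51920.0, 68237.0, 59811.5, 58581.8),
    "ODE5000":  (48788.5, 66233.8, 57615.8, 74103.5),
    "ODE8000":  (47648.7, 71290.9, 61652.9, 57312.7),
    "ODE10000": (47157.9, 64442.4, 55578.4, 56846.0),
    "ODE15000": (46192.9, 70244.5, 60652.9, 56833.0),
}

def relative_gap(z_int, z_lp):
    """Relative distance of an integer solution value from the LP value."""
    return (z_int - z_lp) / z_lp

# Gap of the post-processed greedy solution and of branch-and-bound.
gaps = {
    name: (relative_gap(z_post, z_lp), relative_gap(z_bb, z_lp))
    for name, (z_lp, _z_greedy, z_post, z_bb) in results.items()
}

# Instances on which post-processed greedy beats branch-and-bound.
greedy_wins = sorted(name for name, (g_post, g_bb) in gaps.items()
                     if g_post < g_bb)
```

On these data the post-processed greedy value is lower than the branch-and-bound value for ODE5000 and ODE10000, while branch-and-bound is better on the other three instances at a much higher CPU cost.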

References 

1. Lemieux, P. F., Campagna, L. (1984) The Snow Plowing Problem Solved by a
Graph Theory Algorithm. Civ. Eng. Sys. 1: 337-341

2. Cook, T. M., Alprin, B. S. (1976) Snow and Ice Removal in an Urban
Environment. Man. Sci. 23: 227-234

3. Eglese, R. W. (1994) Routing Winter Gritting Vehicles. Dis. App. Math. 48:
231-244

4. Campbell, J. F., Langevin, A. (1995) The Snow Disposal Assignment Problem.
Jour. of the Oper. Res. Soc. 46: 919-929

5. Papadimitriou, C. H., Steiglitz, K. (1998) Combinatorial Optimization:
Algorithms and Complexity. Dover Publications Inc., New York





Savings Based Ants for Large-scale Vehicle 
Routing Problems 



Marc Reimann and Karl Doerner 

University of Vienna, Department of Management Science, Bruenner Strasse 72,
A-1210 Vienna, Austria



Abstract. In this paper we present a modified version of our Savings based Ant
System for large-scale instances of the Vehicle Routing Problem (VRP). The main
idea is to speed up the search by letting the ants solve only sub-problems rather
than the whole problem. This is particularly necessary when one has to solve large
real world instances, for which the computation times of classic meta-heuristics are
prohibitive. Our model is also inspired by Taillard's work on the VRP ([1]), where
the Tabu Search procedure employed solves only some sub-problems.



1 Introduction 

The VRP, being one of the central problems in distribution logistics, consists 
of finding a set of minimum cost vehicle routes that start and end at a central 
depot, such that each customer is fully served by exactly one vehicle. The 
fleet is homogeneous and each vehicle can carry goods up to its capacity. 
Additionally, constraints on the maximum possible duration of each vehicle’s
tour may exist. This problem is a generalization of the TSP. Thus, besides 
the routing aspect already existing in the TSP one has to find an assignment 
(or clustering) of customers to vehicles. 

In Doerner et al. ([2]) and Reimann et al. ([3]) we have proposed the 
Savings based Ant System for the VRP. In these papers we have shown the 
effectiveness of the approach for the standard benchmark problem instances. 
However, we observed from our simulations that the efficiency in terms of 
the scaling behavior of the algorithm was poor. More specifically the Savings 
based Ant System took around three seconds to solve a problem with 50 
customers, whereas the solution of a 199 customer problem required up to 
40 minutes. This scaling behavior is prohibitive when the Ant System is to 
be applied to real world instances (with up to many hundreds of customers). 
Note however that the same argument applies to basically all other
meta-heuristics, so that different authors propose various methods to speed up
the search (e.g. [4]). Further, we also observed that the average solution
quality obtained by our Ant System deteriorated with problem size: the deviation
from the best known solution rose from less than 0.1% for instances
with 50 customers to almost 2% for instances with 199 customers (cf. [3]).

Apart from these algorithmic observations we learned from our industry
partners that human dispatchers tend to cluster customers according
to postal codes or other regional characteristics before they solve the much
smaller sub-problems for each cluster separately.

These three issues led to the development of our new algorithm. The main 
idea is to split up large problems into a number of smaller problems that can 
be solved both more effectively and more efficiently. Note however, that the 
splitting procedure as well as the recombination of the sub-problems are non- 
trivial problems themselves and we will focus on these issues in this paper. 
Note also that a similar approach was used in combination with a Tabu
Search algorithm by Taillard ([1]).

The outline of the remainder of this paper is as follows. In the next section 
we discuss problem decomposition for VRPs. First we start with a brief review 
of Taillard’s algorithm for the VRP. After that we focus on the description of 
our new approach. We will present first, very preliminary results in section 3 
before we conclude with an outlook on future research. 



2 Decomposing the VRP 

2.1 Taillard’s Decomposition Algorithm 

Taillard ([1]) realized that good solutions to most VRP instances of moderate
to large size feature some spatial characteristics that allow problem
decomposition to be exploited. In terms of Tabu Search, or more generally Local
Search, it is obvious that (local) changes only make sense if the nodes involved
are close to each other. Moreover, simple change operators generally involve
only two tours. Thus, moves can be evaluated and performed simultaneously.

Taillard proposes two distinct methods for partitioning a problem in- 
stance. For uniform problems, a partition into sectors is suggested, while for 
non-uniform problems Taillard uses a partitioning method based on
arborescences and associated shortest paths. The core of the algorithm consists of
the iteration of the following steps. First, the problem is partitioned with 
the appropriate method (as mentioned above). Each partition is then solved 
using a standard Tabu Search approach. After a certain number of iterations 
the sub-problems are re-joined. 

Using these methods, Taillard was able to find all the best known solutions 
for the classic benchmark instances for the VRP. However, the algorithm’s 
performance depended crucially on the initial decomposition and the parti- 
tioning method was problem dependent. 



2.2 Problem Decomposition and the Savings Based Ant System 

Let us now turn to the approach we propose to efficiently solve large-scale
VRPs. Initially an Ant System solves the master problem for a given number
of iterations (step I). Given the best solution found so far, our algorithm
determines for each route of this solution the center of gravity according to
the modified Miehle algorithm ([5]) (step II). We then cluster these route
centers using the Sweep algorithm as proposed by Gillett and Miller ([6]) (step
III). Each of the resulting clusters is then solved independently by applying
our Savings based Ant System for a given number of iterations (step IV).
After all sub-problems have been solved we re-assemble the global solution
and update the global pheromone information and, if applicable, the global
best solution (step V). Steps I to V described above are repeated until a
pre-specified time limit is reached.

Let us now turn to each of the five steps in more detail, where we will 
also discuss implementation issues. 

• Step I: Generation of Initial Solutions 

The initial solutions are generated by applying the Savings based Ant 
System as described and analyzed in detail in Doerner et al. ([2]) and 
Reimann et al. ([3]). 

• Step II: The Modified Miehle Algorithm 

The modified Miehle algorithm is used to compute the center of gravity 
for each vehicle route. This algorithm iteratively adjusts the coordinates 
of the center, until the change in the weighted distance of all customers 
to the center is minimal. The term modified in the name denotes that the 
algorithm can handle a situation where the center of gravity coincides 
with the location of a customer. 

• Step III: The Sweep Algorithm 

Having computed the centers of gravity for each vehicle route, we then 
ignore the actual customer locations and apply the Sweep algorithm to 
cluster the nodes corresponding to the centers of gravity. More specifi- 
cally, a starting node is chosen randomly and the remaining nodes are 
sorted according to their polar angle with the depot and the randomly 
chosen starting node. Using the resulting order, the nodes are then as- 
signed to clusters. 

• Step IV: Solution of the Sub-Problems 

Sub-problems are solved using the Savings based Ant System already 
utilized in Step I. However, in this step each Ant System solves only the 
partial problem it is assigned in Step III.

• Step V: Communication between Processes 

The main notion with respect to the communication between the different 
processes is master pheromone information. In fact, we use one global 
memory for our algorithm. 

The communication between the sub-problems and the master process is 
based on two important components. First, each sub-problem is called 
with the associated part of the master pheromone information. In other 
words, each sub-problem receives only those parts of the master pheromone 
information necessary to solve its part of the problem. The sub-problems 
then change this pheromone information locally as they iteratively solve 
their instances. 







Second, after a sub-problem has been solved, the corresponding process 
returns the best found solution and compares it with the previous best 
found solution. If an improvement was achieved, the best found solu- 
tion is updated and pheromone reinforcement in the master pheromone 
information occurs. Otherwise, only negative reinforcement in terms of 
pheromone evaporation is applied to the master pheromone information. 
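
Steps II and III can be illustrated with a short sketch. It is not the authors' implementation: the center of gravity is computed here with a Weiszfeld-type fixed-point iteration, used as a stand-in for the modified Miehle algorithm, and the safeguard for a center that coincides with a customer (skipping that customer in the update) is an assumption. The function names `center_of_gravity` and `sweep_clusters`, and the equal-size cut of the angular order into clusters, are likewise hypothetical.

```python
import math
import random


def center_of_gravity(points, weights, tol=1e-9, max_iter=1000):
    """Weiszfeld-type iteration for the weighted center of a route's customers."""
    total = sum(weights)
    # Start from the weighted mean of the customer coordinates.
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total
    for _ in range(max_iter):
        num_x = num_y = denom = 0.0
        for (x, y), w in zip(points, weights):
            dist = math.hypot(x - cx, y - cy)
            if dist < tol:
                # Center coincides with a customer: skip it to avoid a division
                # by zero (a simple safeguard, assumed for this sketch).
                continue
            num_x += w * x / dist
            num_y += w * y / dist
            denom += w / dist
        if denom == 0.0:
            break
        nx, ny = num_x / denom, num_y / denom
        moved = math.hypot(nx - cx, ny - cy)
        cx, cy = nx, ny
        if moved < tol:
            break
    return cx, cy


def sweep_clusters(centers, depot, n_clusters, rng=random):
    """Order route centers by polar angle around the depot, starting at a
    randomly chosen center, and cut the order into consecutive clusters."""
    start = rng.randrange(len(centers))

    def angle(i):
        x, y = centers[i]
        return math.atan2(y - depot[1], x - depot[0])

    base = angle(start)
    order = sorted(range(len(centers)),
                   key=lambda i: (angle(i) - base) % (2 * math.pi))
    size = math.ceil(len(order) / n_clusters)
    return [order[k:k + size] for k in range(0, len(order), size)]


if __name__ == "__main__":
    route = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
    print(center_of_gravity(route, [1.0, 1.0, 1.0, 1.0]))
    centers = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
    print(sweep_clusters(centers, (0.0, 0.0), 2, random.Random(0)))
```

Each returned cluster is a list of route indices; the customers of those routes would then form one sub-problem for Step IV.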



3 Computational Experiments 

The problem instances used for our computational study are the 20 large 
scale instances proposed by Golden et al. ([7]). All instances feature one 
central depot, and the customers are located either in concentric circles 
or in squares around the depot. The first 8 instances are distance and 
capacity constrained, whereas the last 12 instances are only capacity 
constrained. Service times are equal for all customers and set to zero. The 
sizes of the instances vary between 200 and 483 customers. 

In order to keep things simple we chose to stick to the parameter values 
that were found to be favorable by Bullnheimer et al. ([8]) for the rank based 
Ant System in general, and in one of our earlier works (cf. [3]) for the Savings 
based Ant System in particular. 

Nonetheless, our new approach features a significant number of parame- 
ters that need to be adjusted. These are: 

• $it_{master}$ ... number of master problem iterations 

• $it_{sub}$ ... number of sub-problem iterations 

• $t_{tot}$ ... total admissible runtime 

• $n_s$ ... number of sub-problems 

For the results presented below we have used the following settings: 
$it_{master} = 2$, $it_{sub} = 100$. The runtime was chosen to reflect the 
differences in the problem sizes and varied between $t_{tot} = 10$ and 
$t_{tot} = 20$ minutes. The actual numbers are given in Table 1 below. 
Finally, an important parameter is the number of sub-problems. For the 
computational study presented below we have partitioned each problem 
instance into $n_s = 4$ sub-problems. 

Table 1 summarizes our computational results. Listed are our Ant System 
(denoted by AS), the GA of Prins ([9]) and the Granular Tabu Search (GTS) 
approach of Toth and Vigo ([4]). While the results for the GA and for the GTS 
are best results given a fixed (the most favourable?) parameter constellation, 
we report average results over 5 runs for our Ant System. The computers 
used are a 200 MHz Pentium for the GTS, a 1 GHz Pentium for the GA and 
a 900 MHz Pentium for our Ant System. 

From the table two things become obvious. For these large scale instances 
the trade-off between solution quality and computational effort is significant. 
More specifically, this can be seen from the last row where we report the 
average relative percentage deviation (RPD) from the best known solution 








Table 1. Comparison of three meta-heuristics on large scale problem instances 

          Best known         AS                 GA                GTS       
 #  size   solutions  Sol.qual.   min.  Sol.qual.    min.  Sol.qual.   min.
 1   240     5646.43    5734.02  10.00    5648.04   32.42    5736.15   4.98
 2   320     8447.92    8629.60  15.00    8459.73   77.92    8553.03   8.28
 3   400    11036.22   11278.88  20.00   11036.22  120.83   11402.75  12.94
 4   480    13624.52   14192.10  20.00    13728.8  187.60   14910.62  15.13
 5   200     6460.98    6460.98  10.00    6460.98    1.04    6697.53   2.38
 6   280     8412.80    8507.82  10.00     8412.9    9.97    8963.32   4.65
 7   360    10195.59   10445.15  15.00    10267.5   39.05   10547.44  11.66
 8   440    11828.78   12158.22  20.00    11865.4    88.3   12036.24  11.08
 9   255      587.09     599.32  10.00     596.89   14.32     593.35  11.67
10   323      746.56     765.32  15.00     751.41   36.58     751.66  15.83
11   399      932.68     960.20  15.00     939.74    78.5     936.04  33.12
12   483     1133.79    1167.86  20.00    1152.88   30.87    1147.14   42.9
13   252      868.80     879.16  10.00     877.71    15.3      868.8  11.43
14   320     1086.24    1111.13  15.00    1089.93   34.07    1096.18  14.51
15   396     1363.34    1386.91  15.00    1371.61  110.48    1369.44  18.45
16   480     1650.42    1683.62  20.00    1650.94  130.97    1652.32  23.07
17   240       709.9     712.50  10.00     717.09    5.86     711.07  14.29
18   300     1014.80    1015.66  10.00    1018.74   39.33    1016.83  21.45
19   360     1376.49    1381.92  15.00     1385.6   74.25    1400.96  30.06
20   420     1846.55    1853.29  20.00    1846.55  210.42    1915.83  43.05

                        RPD (%)    CPU    RPD (%)     CPU    RPD (%)    CPU
Averages                   1.23  14.75       0.56    66.6        2.1  17.54




together with the average computation times for all approaches. Our Ant 
System consumes approximately a fifth of the computation times reported by 
Prins, while we still take four times as much runtime as Toth and Vigo (given 
the differences in the machines used). On the other hand, the GA on average 
performs best, followed by our approach. With respect to solution quality 
Toth and Vigo’s approach comes last. Thus, none of the three approaches 
dominates or is dominated by any other approach when assessed by these 
two goals. However, our algorithm seems to find a nice balance between the 
two goals, whereas the competing approaches can somehow be viewed as the 
polar cases. 
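
The relative percentage deviation used in the last row of the table can be computed as in the following minimal sketch. The helper name `rpd` and the choice of three sample rows are assumptions made for illustration; the numbers are copied from Table 1 (instances 1, 5 and 20).

```python
def rpd(value, best_known):
    """Relative percentage deviation of a solution value from the best known value."""
    return 100.0 * (value - best_known) / best_known


# instance: (best known, AS, GA, GTS), taken from Table 1
rows = {
    1: (5646.43, 5734.02, 5648.04, 5736.15),
    5: (6460.98, 6460.98, 6460.98, 6697.53),
    20: (1846.55, 1853.29, 1846.55, 1915.83),
}

if __name__ == "__main__":
    for instance, (best, ant, ga, gts) in rows.items():
        print(instance,
              round(rpd(ant, best), 2),
              round(rpd(ga, best), 2),
              round(rpd(gts, best), 2))
```

Averaging these deviations over all 20 instances gives the RPD figures of the kind reported in the last row.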

Analyzing the worst case behavior of our algorithm we observed that on 
average the solutions found are within 2.3 % of the best known solutions. This 
worst case deviation is very promising if we consider that the implementation 
of our approach so far resembles merely a proof of concept. 

Finally, let us note that during the course of our experiments we were 
able to improve 2 of the best known solutions. For problem instance #18 we 
found 1012.35 and for problem instance #20 we found 1843.62. 








4 Conclusions 

In this paper we have presented an approach to enhance the efficiency of 
an Ant System for the VRP. The main idea is to partition a starting solu- 
tion to the problem into several disjoint sub-problems, each of which is then 
solved using an Ant System process. We have shown that the algorithm is 
competitive when compared with other state-of-the-art techniques. 

Further analysis should deal with a comparison of the results of our new 
approach with those of the Savings based Ant System as described in our ear- 
lier work. Apart from that more research is required to better understand the 
effects of communication between the master process and the sub-problems. 

The main advantage of our approach is that it can be used in the same 
way for any kind of vehicle routing problem. Furthermore, the same idea can 
be used in dynamic, real-time vehicle routing problems to determine whether 
a new request can be accommodated or not, by optimizing only the subset of 
tours serving the area where the new request comes from. 

Acknowledgments: Financial support from the Oesterreichische National- 
bank (OeNB) under grant #8630 is gratefully acknowledged. 

References 

1. Taillard, E. D. (1993): Parallel iterative search methods for vehicle routing prob- 
lems. Networks 23 661-673. 

2. Doerner, K., Gronalt, M., Hartl, R. F., Reimann, M., Strauss, C. and Stummer, 
M. (2002): SavingsAnts for the Vehicle Routing Problem, in: Cagnoni, S. et al. 
(eds.): Applications of Evolutionary Computing, Springer, Berlin, 11-20. 

3. Reimann, M., Stummer, M. and Doerner, K. (2002): A Savings based Ant System 
for the Vehicle Routing Problem, in: Langdon, W. B. et al. (eds.): Proceedings of 
the Genetic and Evolutionary Computation Conference (GECCO 2002), Morgan 
Kaufmann, San Francisco, 1317-1325. 

4. Toth, P. and Vigo, D. (2002): The Granular Tabu Search and its Application to 
the Vehicle Routing Problem, to appear in: INFORMS Journal on Computing. 

5. Spaeth, H. (1977): Cluster-Analyse-Algorithmen zur Objektklassifizierung und 
Datenreduktion. Oldenbourg, Muenchen. 

6. Gillett, B. and Miller, L. (1974): A heuristic algorithm for the vehicle dispatch 
problem. Operations Research 22 340-349. 

7. Golden, B. L., Wasil, E. A., Kelly, J. P. and Chao, I. M. (1998): The impact 
of metaheuristics on solving the vehicle routing problem: algorithms, problem 
sets, and computational results, in: Crainic, T. G. and Laporte, G. (eds.). Fleet 
management and logistics, Kluwer, Norwell, 33-56. 

8. Bullnheimer, B., Hartl, R. F. and Strauss, Ch. (1999): An improved ant system 
algorithm for the vehicle routing problem. Annals of Operations Research 89 
319-328. 

9. Prins, C. (2001): A Simple and Effective Evolutionary Algorithm for the Vehicle 
Routing Problem. Research report. University of Technology of Troyes, France. 





Single Machine Scheduling Problems with 
Exponentially Start Time Dependent Job 
Processing Times 



Aleksander Bachman^1, Adam Janiak^1 and Mikhail Y. Kovalyov^2 

^1 Institute of Engineering Cybernetics, Wroclaw University of Technology, 
Janiszewskiego 11/17, 50-372 Wroclaw, Poland 
^2 Institute of Engineering Cybernetics, National Academy of Sciences of Belarus, 
Surganova 6, 220012 Minsk, Belarus 



Abstract. The paper deals with single machine scheduling problems, where job 
processing times are given by exponentially start time dependent functions. We 
consider decreasing and increasing functions of job processing times. For the in- 
creasing functions, we proved ordinary and strong NP-hardness of the makespan 
minimization problem under ready times restrictions. We obtained the same re- 
sults, i.e., ordinary and strong NP-hardness, for the maximum lateness minimiza- 
tion problem, but for the decreasing functions of job processing times. We proved 
also ordinary NP-hardness for the total weighted completion time minimization, 
under the assumption that job processing times are characterized by decreasing 
functions. 

1 Introduction 

Scheduling problems investigated in this paper deal with jobs whose 
processing times depend on the starting moment of their execution. In 
the scientific literature, such a dependency is called a deterioration of a job 
processing time. In the past few years, several new models describing the job 
processing time deterioration have been introduced. Almost all of them are 
based on linear functions of the starting moment of a job's execution. A 
recent literature survey of the problems concerning this subject can be found 
in [1]. The variety of the models considered in the scientific literature comes 
from the applications in which the process can be described by this kind of 
processing time. The large number of different models describing the job 
processing time deterioration complicates the determination of the 
computational complexity for the same optimization criteria. 
Therefore, we tried to generalize some results concerning the computational 
complexity of the problems with similar functions characterizing the job 
processing time for different optimization criteria. The remaining part of the 
paper is organized as follows. In the next section we formulate the problems 
considered in the paper. Section 3 deals with the computational complexity 
results obtained for the makespan, the maximum lateness and the total 
weighted completion time minimization problems. Section 4 contains some 
concluding remarks. 







2 Formulation of the problems 



Each problem studied in this paper can be formulated as follows. There are 
given a single machine and $n$ independent and non-preemptive jobs available 
for processing at their ready times $r_i \ge 0$, which should be completed by 
their due dates $d_i$. The processing time $p_i(t)$ of job $i$ is a function 
dependent on the starting moment $t$ of its execution: 

• an increasing model 

$$ p_i(t) = a_i e^{b_i (t - r_i)}, \qquad (1) $$ 

• a decreasing model 

$$ p_i(t) = a_i e^{-b_i (t - r_i)}, \qquad (2) $$ 

where $a_i$ and $b_i$ denote a normal processing time and an 
increasing/decreasing rate, respectively. Note that for each job $i$, the 
change of its processing time $p_i$ starts at its ready time $r_i$. For each 
job $i$, a weight $w_i$ indicating its relative importance might be given. The 
objective is to find a job sequence for which the value of the considered 
criterion is minimized. In this paper, we investigate the following criteria: 
the makespan $C_{\max}$, the maximum lateness $L_{\max}$ and the total 
weighted completion time $\sum w_i C_i$. Adapting the notation 
$\alpha \mid \beta \mid \gamma$ [3], we denote the above mentioned problems as 
$1 \mid r_i, p_i(t) = a_i e^{b_i (t - r_i)} \mid C_{\max}$, 
$1 \mid p_i(t) = a_i e^{-b_i t} \mid L_{\max}$ and 
$1 \mid p_i(t) = a_i e^{-b_i t} \mid \sum w_i C_i$. 



3 NP-hardness results 

In this section, we establish the computational complexity results for the 
following two problems: $1 \mid r_i, p_i(t) = a_i e^{b_i (t - r_i)} \mid C_{\max}$ 
and $1 \mid p_i(t) = a_i e^{-b_i t} \mid L_{\max}$. We prove also that the 
problem $1 \mid p_i(t) = a_i e^{-b_i t} \mid \sum w_i C_i$ 
is at least ordinary NP-hard. In order to prove NP-hardness of the problems 
mentioned above, we use Partition and 3-Partition [2], respectively. The 
decision versions of Partition and 3-Partition, denoted, respectively, by 
PP and 3PP, are formulated as follows. 

PP: Given $m + 1$ positive integers $g_1, \ldots, g_m$ and $G$ such that 
$\sum_{i=1}^{m} g_i = 2G$, is there a partition of the set $\{1, \ldots, m\}$ 
into 2 disjoint subsets $G_1$ and $G_2$ such that 
$\sum_{i \in G_j} g_i = G$ for $j = 1, 2$? 

3PP: Given $3m + 1$ positive integers $h_1, \ldots, h_{3m}$ and $H$ such that 
$\sum_{i=1}^{3m} h_i = mH$ and $H/4 < h_i < H/2$ for $i = 1, \ldots, 3m$; is 
there a partition of the set $\{1, \ldots, 3m\}$ into $m$ disjoint subsets 
$H_1, \ldots, H_m$ such that $\sum_{i \in H_j} h_i = H$ for 
$j = 1, \ldots, m$? 

3.1 Makespan 

In this subsection, we investigate the problems under the assumption that 
$w_j = w$ and $d_j = d$ for $j = 1, \ldots, n$. 








Theorem 1. The problem $1 \mid r_i \in \{0, R\}, p_i(t) = a_i e^{b_i (t - r_i)} \mid C_{\max}$ 
is at least NP-hard in the ordinary sense. 

Proof. Given an instance of PP, construct the following instance of our 
problem. There are $n = m + 1$ jobs. Among them, there are $m$ partition jobs 
$1, \ldots, m$, and a single enforcer job $v$ with the following parameters: 

$r_i = 0$; $a_i = g_i$; $b_i = 0$ for $i = 1, \ldots, m$, 
$r_v = R = G$; $a_v = 1$; $b_v = m$. 

We show now that PP has a solution if and only if there exists a solution 
for the constructed instance of 
$1 \mid r_i \in \{0, R\}, p_i(t) = a_i e^{b_i (t - r_i)} \mid C_{\max}$ with 
the value $C_{\max} \le y = 2G + 1$. 

"Only if". Assume that the subsets $G_1$ and $G_2$ give a solution for PP. Let 
$J_1$ and $J_2$ denote the subsets containing the jobs constructed on the basis 
of the elements from the subsets $G_1$ and $G_2$, respectively. Construct a job 
sequence $S$ such that the machine executes at first the jobs from the subset 
$J_1$ in an arbitrary order, then the enforcer job $v$ and finally the jobs from 
the subset $J_2$ in an arbitrary order, as well. The completion time $C_{\max}$ 
of the last job in $S$ is equal to 

$$ C_{\max} = \max\Big\{\sum_{i \in G_1} g_i,\; r_v\Big\} + a_v e^{b_v \left(\max\left\{\sum_{i \in G_1} g_i,\; r_v\right\} - r_v\right)} + \sum_{i \in G_2} g_i. \qquad (3) $$ 

Note that equation (3) is true for any solution of PP. Since 
$\sum_{i \in G_1} g_i = \sum_{i \in G_2} g_i = G$, based on (3) we obtain 
$C_{\max} = 2G + 1$, as it is required. 

"If". Assume now that any solution of PP will give 
$\sum_{i \in G_1} g_i \ne \sum_{i \in G_2} g_i$. Since $\sum_{i=1}^{m} g_i = 2G$, 
there are two cases to be considered, namely (a) 
$\sum_{i \in G_1} g_i = G + \lambda$ and $\sum_{i \in G_2} g_i = G - \lambda$, 
and (b) $\sum_{i \in G_1} g_i = G - \lambda$ and 
$\sum_{i \in G_2} g_i = G + \lambda$, where $\lambda$ is a positive integer. 
Based on (3), we obtain in case (a) $C_{\max} = 2G + e^{m\lambda} > y$ and in 
case (b) $C_{\max} = 2G + 1 + \lambda > y$, which ends the proof. ■ 
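
The construction in the proof of Theorem 1 can be checked by brute force on tiny instances. The sketch below assumes the increasing model $p_i(t) = a_i e^{b_i (t - r_i)}$ and a machine that starts each job as early as possible; the function names are hypothetical and the enumeration over all permutations is only meant for a handful of jobs.

```python
import math
from itertools import permutations


def makespan(sequence, jobs):
    """jobs[j] = (r, a, b); a job started at time t >= r takes a * exp(b * (t - r))."""
    t = 0.0
    for j in sequence:
        r, a, b = jobs[j]
        start = max(t, r)
        t = start + a * math.exp(b * (start - r))
    return t


def theorem1_instance(g):
    """Partition jobs built from g plus one enforcer job, and the threshold y = 2G + 1."""
    G = sum(g) // 2
    m = len(g)
    jobs = [(0.0, float(gi), 0.0) for gi in g]   # r_i = 0, a_i = g_i, b_i = 0
    jobs.append((float(G), 1.0, float(m)))       # enforcer: r_v = G, a_v = 1, b_v = m
    return jobs, 2 * G + 1


def best_makespan(jobs):
    """Smallest makespan over all job sequences (exhaustive search)."""
    return min(makespan(seq, jobs) for seq in permutations(range(len(jobs))))


if __name__ == "__main__":
    for g in ([1, 2, 3], [1, 1, 4]):
        jobs, y = theorem1_instance(g)
        print(g, "threshold", y, "best makespan", best_makespan(jobs))
```

For `[1, 2, 3]` a perfect partition exists and the best makespan meets the threshold; for `[1, 1, 4]` no partition exists and every sequence exceeds it.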

Theorem 2. The problem $1 \mid r_i, p_i(t) = a_i e^{b_i (t - r_i)} \mid C_{\max}$ 
is NP-hard in the strong sense. 

Proof. Given an instance of 3PP, construct the following instance of our 
scheduling problem. There are $n = 4m$ jobs. Among them there are $3m$ 
partition jobs $1, \ldots, 3m$, and $m$ enforcer jobs $v_i$ for 
$i = 1, \ldots, m$, with the following parameters: 

$r_i = 0$; $a_i = h_i$; $b_i = 0$ for $i = 1, \ldots, 3m$, 

$r_{v_i} = iH + (i - 1)$; $a_{v_i} = 1$; $b_{v_i} = iH$ for $i = 1, \ldots, m$. 

It can be shown that 3PP has a solution if and only if there exists a solution 
for the constructed instance of 
$1 \mid r_i, p_i(t) = a_i e^{b_i (t - r_i)} \mid C_{\max}$ with the 
value $C_{\max} \le y = mH + m$. Because of the space limitations, the detailed 
description of this proof will be omitted. ■ 








3.2 Maximum lateness 

In this subsection, we investigate the problems under the assumption that 
$w_i = w$ and $r_i = r = 0$. 



Theorem 3. The problem $1 \mid p_i(t) = a_i e^{-b_i t}, d_i \in \{d, D\} \mid L_{\max}$ 
is at least NP-hard in the ordinary sense. 

Proof. Let us start with the reduction from PP. There are $n = m + 1$ jobs. 
Among them there are $m$ partition jobs $1, \ldots, m$, and a single enforcer 
job $v$ with the following parameters: 

$d_i = d = 2G + (2e)^{-1}$; $a_i = g_i$; $b_i = 0$ for $i = 1, \ldots, m$, 
$d_v = G + (2e)^{-1}$; $a_v = 1/2$; $b_v = 1/G$. 

We show now that PP has a solution if and only if there exists a solution 
for the constructed instance of 
$1 \mid p_i(t) = a_i e^{-b_i t}, d_i \in \{d, D\} \mid L_{\max}$ with 
the value $L_{\max} \le y = 0$. 

"Only if". Assume that the subsets $G_1$ and $G_2$ give a solution for PP. Let 
$J_1$ and $J_2$ denote the subsets containing the jobs constructed on the basis 
of the elements from the subsets $G_1$ and $G_2$, respectively. A job sequence 
$S$ is constructed as follows. The machine executes at first the jobs from the 
subset $J_1$ in an arbitrary order, then the enforcer job and finally the jobs 
from the subset $J_2$ in an arbitrary order, as well. The maximum lateness 
$L_{\max}$ for the sequence $S$ is equal to 

$$ L_{\max} = \max\Big\{ \sum_{i \in G_1} g_i + a_v e^{-b_v \sum_{i \in G_1} g_i} - d_v,\; \sum_{i=1}^{m} g_i + a_v e^{-b_v \sum_{i \in G_1} g_i} - d \Big\}. \qquad (4) $$ 

Since $\sum_{i \in G_1} g_i = \sum_{i \in G_2} g_i = G$, then based on (4) we 
have $L_{\max} = 0 = y$, as it is required. 

"If". Assume now that any solution of PP gives 
$\sum_{i \in G_1} g_i \ne \sum_{i \in G_2} g_i$. 
Since $\sum_{i=1}^{m} g_i = 2G$, then there are two cases to be considered, 
namely (a) $\sum_{i \in G_1} g_i = G - \lambda$ and 
$\sum_{i \in G_2} g_i = G + \lambda$, and (b) 
$\sum_{i \in G_1} g_i = G + \lambda$ and $\sum_{i \in G_2} g_i = G - \lambda$, 
where $\lambda$ is a positive integer. By (4) we have in case (a) 
$L_{\max} = (2e)^{-1} \left(e^{\lambda / G} - 1\right) > 0 = y$. The result 
obtained in case (b) is equal to 
$L_{\max} = (2e)^{-1} e^{-\lambda / G} - (2e)^{-1} + \lambda > 0$. It means 
that PP has no solution, which ends the proof. ■ 
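
As for the makespan case, the reduction of Theorem 3 can be sanity-checked numerically on tiny instances. The sketch assumes the decreasing model $p_i(t) = a_i e^{-b_i t}$ with all ready times equal to zero and no idle time; the function names are hypothetical, and floating-point results are compared with a small tolerance.

```python
import math
from itertools import permutations


def max_lateness(sequence, jobs):
    """jobs[j] = (a, b, d); a job started at time t takes a * exp(-b * t)."""
    t = 0.0
    worst = -math.inf
    for j in sequence:
        a, b, d = jobs[j]
        t += a * math.exp(-b * t)
        worst = max(worst, t - d)
    return worst


def theorem3_instance(g):
    """Partition jobs built from g plus the enforcer job of the proof."""
    G = sum(g) // 2
    slack = 1.0 / (2.0 * math.e)                      # (2e)^(-1)
    jobs = [(float(gi), 0.0, 2 * G + slack) for gi in g]
    jobs.append((0.5, 1.0 / G, G + slack))            # a_v = 1/2, b_v = 1/G
    return jobs


def best_max_lateness(jobs):
    """Smallest maximum lateness over all job sequences (exhaustive search)."""
    return min(max_lateness(seq, jobs) for seq in permutations(range(len(jobs))))


if __name__ == "__main__":
    for g in ([1, 2, 3], [1, 1, 4]):
        print(g, best_max_lateness(theorem3_instance(g)))
```

With a perfect partition (`[1, 2, 3]`) the best maximum lateness is zero up to rounding, while for `[1, 1, 4]` it stays strictly positive.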



Theorem 4. The problem $1 \mid p_i(t) = a_i e^{-b_i t} \mid L_{\max}$ is strongly NP-hard. 

Proof. Given an instance of 3PP, construct the following instance of the 
scheduling problem. There are $n = 4m$ jobs. Among them there are $3m$ 
partition jobs $1, \ldots, 3m$, and $m$ enforcer jobs $v_i$ for 
$i = 1, \ldots, m$, with the following parameters: 








di-d - mH + at = hf, bi-0; for i = 1, . . . , 3m, 

dy, = iif + ay, = 1/m; by, = [iff + for i = 1, . . . ,m. 

We can show that 3PP has a solution if and only if there exists a solution 
for the constructed instance of $1 \mid p_i(t) = a_i e^{-b_i t} \mid L_{\max}$ 
with the value $L_{\max} \le y = 0$. However, because of space limitations, 
the detailed proof will be omitted. ■ 



3.3 Total weighted completion time 

We assume that $r_j = r = 0$ and $d_j = d$ are the common values characterizing 
the jobs in the problem investigated in this subsection. 

Theorem 5. The problem $1 \mid p_i(t) = a_i e^{-b_i t} \mid \sum w_i C_i$ is at 
least NP-hard in the ordinary sense. 

Proof. First a reduction from PP to the scheduling problem is given. 
Construct an instance of the scheduling problem with $n = m + 1$ jobs. The 
parameters of the first $m$ jobs, called the partition jobs, and the 
parameters of a single enforcer job $v$ are given as follows: 

$w_i = g_i$; $a_i = g_i$; $b_i = 0$ for $i = 1, \ldots, m$, 

$w_v = G$; $a_v = (2mG)^{-1}$; $b_v = 1$. 

We show now, that PP has a solution if and only if there exists a solution 
for the constructed instance of 1 [ pi{t) = | Y^wiCi with the value 

E WiCi <y = Eti 9i + Ei<i<j<m 9i9j + 

‘‘Only if’. Assume that the solution of PP is given by two disjoint subsets 
$G_1$ and $G_2$ for which $\sum_{i \in G_1} g_i = \sum_{i \in G_2} g_i = G$. 
Let $J_1$ and $J_2$ denote the 
subsets containing the jobs constructed on the basis of the elements from the 
subsets $G_1$ and $G_2$, respectively. Let the subsets $J_1$ and $J_2$ contain 
$k$ and $m - k$ jobs ($0 \le k \le m$), respectively. If $k = 0$ or $k = m$, 
then there is no solution for PP. The solution for the considered scheduling 
problem is given as follows. The machine executes at first the jobs from the 
subset $J_1$ in an arbitrary order, then the enforcer job and finally the jobs 
from the subset $J_2$ in an arbitrary order, as well. The criterion value for 
any schedule $\pi$ described above is given by 

k k n n 

^Tr(i) ^7 t(j) + ^ ^Tr(i) ^7r(j) 

i—l j=i i=k-\-2 j—i 

i=ik-\-2 




( 5 ) 








where the description $\pi(i)$ denotes the job executed in the $i$th position 
in the schedule $\pi$. Since $a_i = g_i$ and $w_i = g_i$, then expression (5) 
is equal to: 



Y^WiCi = Y^gl+ Si9j 

l<i<j<m 



i=l 






G + ^2 Si 

i=k-{-l J 



r k > 

Ki=l } 



E 5. • (6) 



Since 9i = Ei=i 5i = G and Si = EHit+i Si = G, then ex- 

pression (6) is equal to the required value y = 9i + Yli<i<j<m 9i9j + 

“If’ . Assume now that any solution of PP represented by two disjoint sub- 
sets Gi and G 2 will give ^ Si€G 2 9^' There are two cases to be 

considered, namely case (a) gi = G — X and 9i — ^ ^ ^ 

case (b) Y^ieOi gi = G + X and 9 i - G - X, where A is some posi- 
tive integer. By (6), in case (a) we have Y^WiCi = y [e^ ~ l] + 

A [a -h > y. The result obtained in case (b) is equal to 

Y WiCi = y + [e~^ - l] -f- A [A - {2mG)~^e~^~^] > y. Since both 

results obtained above are greater than y, the theorem is proved. ■ 



4 Conclusions 

In this paper, we established the computational complexity status of three 
scheduling problems with exponentially dependent job processing times. We 
considered the following optimization criteria: the makespan, the maximum 
lateness and the total weighted completion time. In the NP-hardness proofs, 
we used the well-known Partition and 3-Partition problems. Since the 
complexity status is now clear, future research will focus on the 
construction of efficient algorithms solving the problems considered in 
this paper. 



References 

1. B. Alidaee, N.K. Womer, “Scheduling with time dependent processing times: 
Review and extensions”, Journal of the Operational Research Society, 50, 711- 
720 (1999). 

2. M.R. Garey, D.S. Johnson, “Computers and Intractability: A Guide to the The- 
ory of NP-completeness” , Freeman, San Francisco, 1979. 

3. R.L. Graham, E.L. Lawler, J.K. Lenstra and A.H.G. Rinnooy Kan, “Optimiza- 
tion and approximation in deterministic sequencing and scheduling: a survey” , 
Annals of Discrete Mathematics, 5, 287-326 (1979). 





Scheduling Problems with Optimal Due 
Interval Assignment Subject to Some 
Generalized Criteria 



Adam Janiak and Marcin Marek 

Institute of Engineering Cybernetics, Wroclaw University of Technology 
Janiszewskiego 11/17, 50-372 Wroclaw, Poland 



Abstract. The paper deals with two problems of scheduling jobs on identical par- 
allel machines, in which a due interval should be assigned to each job. Due interval 
is a generalization of well known classical due date and describes a time interval, in 
which a job should be finished. In the first problem, we have to find a schedule of 
jobs and a common due interval such that the sum of the total tardiness, the total 
earliness and due interval parameters is minimized. The second problem is to find 
a schedule of jobs and an assignment of due interval to all jobs, which minimize 
the maximum of the following three parts: the maximum tardiness, the maximum 
earliness and the due interval parameters. We proved that the considered problems 
are NP-hard and outlined some methods how to solve them approximately as well 
as optimally. 



1 Introduction 

In the scientific literature on scheduling theory, the assignment of due dates 
has been considered (see [1], [2], [3]). However, the group of problems in 
which a due interval should be assigned to each job is also of permanent 
interest (see [6], [7], [8]). A due interval is a generalization of the 
classical due date and describes a time interval in which a job should be 
finished. For the considered problems, we should find a schedule of jobs and a 
common due interval such that the criterion value is minimized. In the first 
problem we minimize the sum of total tardiness, total earliness and due 
interval size. In the second problem we minimize the maximum of the following 
three parts: maximum tardiness, maximum earliness and due interval size. The 
considered problems are motivated by many manufacturing systems in which a 
negotiation between the producer and the customer occurs. The negotiation 
concerns the delivery of the final products. The producer's objective is to 
deliver the products as late as possible, while the customer tries to have 
them as soon as possible. The compromise of this negotiation is a due 
interval, i.e., the time period in which the products are completed by the 
producer and are available to be taken by the customer. 

The remainder of the paper is organized as follows. In the next section we 
formulate the problems more precisely. Sections 3 and 4 are, respectively, 
devoted to properties and the computational complexity of the first problem. 
In Section 







5 we show that both considered problems are equivalent to each other in 
terms of the optimal solution. We conclude the paper in Section 6. 

2 Problems Formulation 

There is given a set of $n$ independent and non-preemptive jobs to be scheduled 
on $m$ identical parallel machines. Each machine can process only one job at a 
time. We assume that machines execute the jobs without idle times; however, 
we allow a time gap between the beginning of the optimization process and 
the start of the execution of the first job on any machine. For each job $j$ 
its processing time $p_j$ and a due interval $(d_j' = k_1;\ d_j'' = k_2)$ are 
given, where $k_1$ and $k_2$ ($k_1 < k_2$) denote common due interval 
parameters. The first problem (P1) is to find a schedule $\pi$ (sequence of 
jobs on each machine) and values of the parameters $k_1$ and $k_2$, which 
minimize the following criterion: 

$$ f(\pi, k_1, k_2) = \sum_j E_j + (k_2 - k_1) + \sum_j T_j, \qquad (1) $$ 

where $E_j = \max\{d_j' - C_j, 0\}$ is the earliness of job $j$, 
$T_j = \max\{0, C_j - d_j''\}$ is the tardiness of job $j$ and $C_j$ is the 
completion moment of job $j$. 

The second problem (P2) is to find a schedule $\pi$ and values of the 
parameters $k_1$ and $k_2$, which minimize the following criterion: 

$$ g(\pi, k_1, k_2) = \max\Big\{ P_1 \max_j E_j,\; P_2 (k_2 - k_1),\; P_3 \max_j T_j \Big\}, \qquad (2) $$ 

where $P_1 > 0$, $P_2 > 0$ and $P_3 > 0$ are constant weights. 

3 Properties of Problem P1 

In this section we show some properties of Problem P1. In the following 
considerations we assume that $C_{\min} = \min_j C_j$ and $C_{\max} = \max_j C_j$. 

Property 1. In Problem P1, for any schedule $\pi$, the optimal values of the 
parameters $k_1$ and $k_2$ are equal to $k_1^* = C_{\min}$ and 
$k_2^* = C_{\max}$, respectively. 

Proof. We consider the following two cases: (1) for any schedule $\pi$ and any 
value $k_2$, the optimal value of $k_1$ is equal to $k_1' = k_1^* + \varepsilon$; 
(2) for any schedule $\pi$ and any value of $k_1$, the optimal value of $k_2$ is 
equal to $k_2' = k_2^* + \varepsilon$, where $\varepsilon$ is some positive or 
negative value. We show how to prove case (1) with $\varepsilon > 0$. 
The remaining cases are very similar and will be omitted. 

Let $f(\pi, k_1', k_2)$ and $f(\pi, k_1^*, k_2)$ denote values of the criterion 
(1) obtained for the parameters $k_1' = k_1^* + \varepsilon$ and 
$k_1^* = C_{\min}$, respectively. We have: 

$$ f(\pi, k_1', k_2) = \sum_j \max\{d_j' - C_j, 0\} + (k_2 - k_1') + \sum_j T_j \ge \max\{k_1' - C_{\min}, 0\} + (k_2 - k_1') + \sum_j T_j $$ 
$$ = \max\{k_1^* + \varepsilon - C_{\min}, 0\} + (k_2 - k_1^* - \varepsilon) + \sum_j T_j = (k_2 - k_1^*) + \sum_j T_j = f(\pi, k_1^*, k_2). $$ 








This result contradicts the assumption that $k_1'$ is optimal. For the remaining 
cases we obtain similar results, which ends the proof. □ 
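
Property 1 can be illustrated numerically. The sketch below evaluates criterion (1) for a fixed list of completion moments and searches a small integer grid of due interval parameters; the function name `criterion_f`, the sample completion moments and the grid are assumptions made for illustration only.

```python
def criterion_f(completions, k1, k2):
    """Criterion (1): total earliness + due interval size + total tardiness."""
    earliness = sum(max(k1 - c, 0) for c in completions)
    tardiness = sum(max(c - k2, 0) for c in completions)
    return earliness + (k2 - k1) + tardiness


if __name__ == "__main__":
    completions = [3, 5, 9]                      # hypothetical completion moments
    c_min, c_max = min(completions), max(completions)
    at_property_1 = criterion_f(completions, c_min, c_max)
    grid_best = min(criterion_f(completions, k1, k2)
                    for k1 in range(0, 13)
                    for k2 in range(k1 + 1, 13))
    print(at_property_1, grid_best)
```

On this example no pair on the grid gives a smaller value than the choice $k_1 = C_{\min}$, $k_2 = C_{\max}$, although other pairs may reach the same value.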

From Property 1 it follows that in any schedule $\pi$, for the optimal values 
$k_1^*$ and $k_2^*$ there are no early and no tardy jobs. Thus, we have: 

$$ f(\pi, k_1^*, k_2^*) = \sum_j E_j + (k_2^* - k_1^*) + \sum_j T_j = C_{\max} - C_{\min}. \qquad (3) $$ 

Property 2. In an optimal solution to Problem P1, the jobs executed as first on 
each processor complete at the same moment. 

Proof. Assume that $\pi$ is an optimal solution to Problem P1 and $C_{\min}$ is 
the completion moment of the jobs executed as first on each processor except 
for processor $r \in \{1, \ldots, m\}$. The completion moment of the first job 
executed on processor $r$ is equal to $C_{\min} - \varepsilon$, where 
$\varepsilon > 0$. Assume also that in schedule $\pi'$ the first job executed 
on processor $r$ completes at $C_{\min}$. Now we show that the criterion (1) 
for schedule $\pi'$ is not greater than the criterion for the schedule $\pi$. 
We will use the following notations: $C_{\pi_r \min}$ - the completion 
moment of the first job on the $r$th processor in schedule $\pi$, 
$C_{\pi_r \max}$ - the completion moment of the last job on the $r$th 
processor in schedule $\pi$, $C_{\pi \max}$ - the maximum completion moment 
in schedule $\pi$. 

There are only the following two exhaustive cases: (1) the jobs executed on 
the $r$th processor complete $\varepsilon$ time units later and it doesn't 
cause the change of the maximum completion moment in schedule $\pi$, i.e., 
$C_{\pi \max} \ge C_{\pi_r \max} + \varepsilon$; 
(2) the jobs executed on the $r$th processor complete $\varepsilon$ time units 
later and it causes the change of the maximum completion moment in schedule 
$\pi$, i.e., $C_{\pi_r \max} \le C_{\pi \max} < C_{\pi_r \max} + \varepsilon$. 

Case (1): It follows from expression (3) that the values of criterion (1) for 
schedules $\pi$ and $\pi'$ are, respectively, as follows: 
$f(\pi, k_1, k_2) = C_{\pi \max} - C_{\pi_r \min}$, 
$f(\pi', k_1, k_2) = C_{\pi \max} - (C_{\pi_r \min} + \varepsilon)$. Thus, 
$f(\pi, k_1, k_2) - f(\pi', k_1, k_2) = \varepsilon$, 
which contradicts the optimality of the schedule $\pi$. 

Case (2): If $C_{\pi_r \max} \le C_{\pi \max} < C_{\pi_r \max} + \varepsilon$, 
then the criterion values for schedules $\pi$ and $\pi'$ can be estimated as 
follows: 
$f(\pi', k_1, k_2) = C_{\pi_r \max} + \varepsilon - (C_{\pi_r \min} + \varepsilon) = C_{\pi_r \max} - C_{\pi_r \min} \le C_{\pi \max} - C_{\pi_r \min} = f(\pi, k_1, k_2)$, 
which contradicts the assumption that schedule $\pi$ is optimal. 

Notice that the cases considered above include also the case where the 
first job executed on one processor completes $\varepsilon$ time units later 
than the first jobs on the remaining processors. □ 

Property 3. In the optimal solution to Problem P1, jobs completed as first 
on the processors have the largest processing times. 

Proof. Assume that in the optimal solution $\pi$, job $s$ with the processing 
time $p_s$ is scheduled as first on one of the processors. From Property 2 it 
follows that the completion moment of job $s$ is equal to $C_{\min}$. Without 
loss of generality we assume that $C_{\min} = p_{\max}$, where 
$p_{\max} = \max_j p_j$. Assume also that 






job $t$ is scheduled on any position except the first one on any processor and 
$p_t = p_s + \varepsilon$. Denote by $\pi'$ the schedule in which jobs $s$ and 
$t$ are exchanged. In the schedule $\pi'$ job $t$ has the completion moment 
$C_{\min}$ (according to Property 2). It follows from the expression (3) that 
the criterion value (1) of the schedules $\pi$ and $\pi'$ can be estimated as 
follows: 
$f(\pi', k_1, k_2) = C_{\pi' \max} - C_{\min} \le C_{\pi \max} - C_{\min} = f(\pi, k_1, k_2)$, 
since $C_{\pi' \max} \le C_{\pi \max}$. This result 
contradicts the assumption that schedule $\pi$ is optimal. □ 

4 Computational Complexity and Solution Approaches 
for Problem P1 

Theorem 1. Problem P1 is strongly NP-hard. 

Proof. From Property 3 it follows that in an optimal solution to Problem P1, 
the $m$ jobs with the largest processing times are scheduled as first on 
particular processors and have the common completion moment 
$C_{\min} = p_{\max}$. The remaining jobs should be scheduled so that the 
maximum completion moment $C_{\max}$ is minimized (see expression (3)). Thus, 
Problem P1 is equivalent to the well known $P \| C_{\max}$ scheduling problem 
(for $n - m$ jobs), which is known to be strongly NP-hard in the general 
case [4]. □ 

In the literature there are many approximation as well as optimal
algorithms solving the problem $P \| C_{\max}$. For example, in paper [5] the author
proposes the LPT (longest processing time) rule with the worst-case ratio
$R_{LPT} \le \frac{4}{3} - \frac{1}{3m}$. In [4] a pseudopolynomial-time optimal
algorithm is presented, and the author of paper [9] proposes a fully polynomial-time
approximation scheme for problem $P2 \| C_{\max}$. After some modifications, all
the methods mentioned above can be used to solve the considered Problem P1
with the same quality of solutions. The modifications concern the optimal scheduling
of the $m$ jobs with the largest processing times, according to Property 2 and
Property 3, and determining the optimal values of the parameters $k_1$ and $k_2$
according to Property 1.
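The LPT rule mentioned above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `lpt_schedule` and the data layout are hypothetical and not taken from [5]. Because the jobs are sorted by non-increasing processing time, the first $m$ jobs automatically open the $m$ machines, which agrees with Property 3.

```python
import heapq

def lpt_schedule(processing_times, m):
    """Assign jobs to m identical machines with the LPT rule.

    Returns (makespan, assignment), where assignment[i] lists the
    processing times placed on machine i in processing order.
    """
    # Min-heap of (current load, machine index): least-loaded machine on top.
    loads = [(0, i) for i in range(m)]
    heapq.heapify(loads)
    assignment = [[] for _ in range(m)]
    # Longest job first, always onto the currently least-loaded machine.
    for p in sorted(processing_times, reverse=True):
        load, i = heapq.heappop(loads)
        assignment[i].append(p)
        heapq.heappush(loads, (load + p, i))
    makespan = max(load for load, _ in loads)
    return makespan, assignment

makespan, assignment = lpt_schedule([5, 4, 3, 3, 3], m=2)
print(makespan, assignment)  # 10 [[5, 3], [4, 3, 3]]
```

For this instance an optimal schedule has makespan 9 (jobs 5+4 and 3+3+3), so the LPT result 10 stays within the worst-case ratio $\frac{4}{3} - \frac{1}{3m} = \frac{7}{6}$ for $m = 2$.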



5 Equivalence of Problems PI and P2 



In this section we show that the optimal schedule of jobs for Problem P1
is equivalent to the optimal schedule of jobs for Problem P2. First,
however, we present the following lemmas.



Lemma 1. In any schedule of jobs for Problem P2, the optimal values of
the parameters $k_1$ and $k_2$ fulfill, respectively, the following inequalities:
$k_1^* \ge C_{\min}$, $k_2^* \le C_{\max}$.



Proof. The proof is very similar to the proof of Property 1 and therefore will 
be omitted. □ 







It follows from Lemma 1 that the criterion value (2) is equal to:

$g(\pi, k_1, k_2) = \max\{P_1 (k_1 - C_{\min}),\ P_2 (k_2 - k_1),\ P_3 (C_{\max} - k_2)\}$. (4)

The following two lemmas hold for the criterion (4), for a fixed schedule $\pi$.
These lemmas also hold for a general function:

$h(u, v, w) = \max\{A_1 u,\ A_2 v,\ A_3 w\}$ (5)

subject to:

$u + v + w = A$, (6)

where $u$, $v$ and $w$ are some nonnegative variables, $A$ is a given nonnegative
constant and $A_1$, $A_2$ and $A_3$ are given nonnegative weights.

Lemma 2. The value of the expression (5) is minimal if $A_1 u = A_2 v = A_3 w$.

Proof. The proof is trivial and will be omitted. □

Lemma 3. The optimal values of the variables $u$, $v$ and $w$, which minimize
the function (5), are as follows:
$u^* = \frac{A_2 A_3 A}{A_1 A_2 + A_1 A_3 + A_2 A_3}$,
$v^* = \frac{A_1 A_3 A}{A_1 A_2 + A_1 A_3 + A_2 A_3}$,
$w^* = \frac{A_1 A_2 A}{A_1 A_2 + A_1 A_3 + A_2 A_3}$.

Proof. The proof can be easily derived from Lemma 2 and constraint (6) and
will be omitted. □

It follows from Lemma 3 and the expression (4) that the criterion value
(2) can be described as follows:

$g(\pi, k_1, k_2) = \frac{P_1 P_2 P_3}{P_1 P_2 + P_1 P_3 + P_2 P_3} (C_{\max} - C_{\min})$. (7)



From Lemma 3 we obtain the following property.

Property 4. In Problem P2, for any schedule $\pi$, the optimal value of the
parameter $k_1$ is equal to
$k_1^* = \frac{P_2 P_3}{P_1 P_2 + P_1 P_3 + P_2 P_3} (C_{\max} - C_{\min}) + C_{\min}$
and the optimal value of the parameter $k_2$ is equal to
$k_2^* = \frac{P_1 P_3 + P_2 P_3}{P_1 P_2 + P_1 P_3 + P_2 P_3} (C_{\max} - C_{\min}) + C_{\min}$.

Proof. The optimal values of the parameters $k_1$ and $k_2$ can be easily derived
from Lemma 3. □
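As a numerical check of Lemma 3 and Property 4, the closed-form values can be computed directly. This is a minimal sketch: the function name `optimal_due_interval` and its argument names are hypothetical, and strictly positive weights are assumed so that the denominator is nonzero.

```python
def optimal_due_interval(c_min, c_max, p1, p2, p3):
    """Closed-form k1*, k2* and criterion value (4) for a fixed schedule.

    Assumes strictly positive weights p1, p2, p3 (an assumption of this
    sketch; the text only requires nonnegative weights).
    """
    denom = p1 * p2 + p1 * p3 + p2 * p3
    span = c_max - c_min            # plays the role of A in constraint (6)
    u = p2 * p3 * span / denom      # u* = k1* - C_min
    v = p1 * p3 * span / denom      # v* = k2* - k1*
    w = p1 * p2 * span / denom      # w* = C_max - k2*
    k1 = c_min + u
    k2 = k1 + v
    value = max(p1 * u, p2 * v, p3 * w)  # all three terms coincide (Lemma 2)
    return k1, k2, value

print(optimal_due_interval(2, 13, 1, 2, 3))  # (8.0, 11.0, 6.0)
```

With $C_{\min} = 2$, $C_{\max} = 13$ and weights 1, 2, 3 the three weighted segments are all equal to 6, which matches $\frac{P_1 P_2 P_3}{P_1 P_2 + P_1 P_3 + P_2 P_3}(C_{\max} - C_{\min}) = \frac{6}{11} \cdot 11$.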



Theorem 2. The optimal schedule of jobs in Problem P2 is equivalent to the
optimal schedule of jobs in Problem P1.

Proof. From the expressions (3) and (7) it follows that for any schedule $\pi$ and the
optimal values of the parameters $k_1$ and $k_2$, the criterion value (2) is proportional
to the criterion value (1): $g(\pi, k_1^*, k_2^*) \propto f(\pi, k_1^*, k_2^*)$. □

It follows from Theorem 2 that the considerations concerning the properties
of the optimal schedule, the computational complexity, and the solution
methodology, which were presented in Sections 3 and 4 for Problem P1, are also
true for Problem P2. The only exception is the method of determining the
optimal values of the parameters $k_1$ and $k_2$, given by Property 1 for Problem P1
and by Property 4 for Problem P2.








6 Conclusions 

In this paper, we considered two problems of scheduling jobs on identical
parallel machines with a common due interval assignment. Some properties of
the optimal solution were provided for these problems. Based on these properties,
we proved that the solutions of our problems are equivalent to the solution of the
well-known $P \| C_{\max}$ scheduling problem, which is known to be strongly NP-hard.
Therefore, we adapted some known solution methods to solve our
problems. The results obtained are promising and motivate further work
on other scheduling problems with due interval assignment.

References 

1. Cheng T.C.E., Gupta M.C. (1989). Survey of scheduling research involving due 
date determination decision. European Journal of Operational Research. 38, 
156-166. 

2. Chengbin C., Gordon V., Proth J.-M. (2002). A survey of the state-of-the-art 
of common due date assignment and scheduling research. European Journal of 
Operational Research, 139, 1-25. 

3. Chengbin C., Gordon V., Proth J.-M. (2002). Due date assignment and scheduling:
SLK, TWK and other due date assignment models. Production Planning &
Control. 13, 117-132.

4. French S. (1982). Sequencing and Scheduling: An Introduction to the Mathe- 
matics of the Job-Shop. Horwood, Chichester. 

5. Graham R.L. (1969). Bounds on multiprocessing timing anomalies. SIAM J.
Appl. Math. 17, 263-369.

6. Janiak A., Marek M. (2001). Multi-machine scheduling problem with optimal
due interval assignment subject to generalized sum type criterion. Operations
Research Proceedings 2001. Springer, 207-212.

7. Liman S.D., Panwalkar S.S., Thongmee S. (1996). Determination of common 
due window location in a single machine scheduling problem. European Journal 
of Operational Research. 93, 68-74. 

8. Liman S.D., Panwalkar S.S., Thongmee S. (1998). Common due window size 
and location determination in a single machine scheduling problem. Journal of 
the Operational Research Society. 49, 1007-1010. 

9. Sahni S. (1976). Algorithm for scheduling independent tasks. J. Assoc. Comput.
Mach. 23, 116-127.





A New Exact Resource Allocation Model with 
Hard and Soft Resource Constraints 



Ferenc Kruzslicz 

University of Pécs, Department of Business Informatics,
Rákóczi út 80, H-7623 Pécs, Hungary



Abstract. In this paper we present a new bilevel resource allocation model with
hard and soft resource constraints for projects. By definition, a hard resource
constraint is not resolvable within the given planning horizon, but a soft resource conflict
may be managed by a flexible "hiring-firing" strategy.

According to the definition, we first solve a "makespan-minimization" problem
for the hard resources. After that, we solve a "resource balancing" problem on the
set of "hard resource feasible" schedules. In the "well-balanced" schedule searching
phase we characterize the soft resource requirements with a new bicriteria measure,
namely with the "peak resource requirement" and the "idle time", simultaneously.
In the efficient solution searching phase of the proposed approach we apply a new
"trees in tree"-like implicit enumeration method with effective pruning rules.

The practical interpretation of the proposed model is demonstrated in an analysis
of a simplified small-scale business software development environment.



1 Introduction 

The model introduced in this article is designed to schedule complex multi-resource,
multi-project problems with a wide range of practical applications.
The word "multi" is used here in its wider sense, since both resources and projects
may differ substantially from each other [2]. The classification of resource types
is based on the flexibility level of resource availability. Owing to the NP-hard
nature of the problem [1], the model provides exact solutions for small
to medium-size problems.

2 The model 

Let $P = \{P_1, P_2, \ldots, P_n\}$ denote the set of projects to be scheduled. Every
project $P_i$ consists of $n_i$ different tasks, $P_i = \{T_{i,j} : j = 1 \ldots n_i\}$. Tasks are
atomic activities with known resource and time requirements. Additionally,
there is a set $H_i = \{T_{i,j_1} \to T_{i,j_2}\}$ of immediate precedence relations for every
project $P_i$. If activity $T_{i,j_1}$ must be finished before activity $T_{i,j_2}$ is started, it is
written as $T_{i,j_1} \to T_{i,j_2}$.

We have $k$ different classes of resources with hard constraints, denoted
by $RH = \{RH^1, RH^2, \ldots, RH^k\}$. In a similar manner we can define the
set of resources with soft constraints, $RS = \{RS^1, RS^2, \ldots, RS^l\}$. For every
resource $R \in RH \cup RS$ a resource limit function $R(t)$ is given that represents
the available amount of resource $R$ at time $t$.



Definition: Resource $R$ is referred to as a hard constraint if at any time $t$ no more
than $R(t)$ units of this resource can be allocated.

In the case of hard constraints no auxiliary resources can be applied, not even
on a short-term basis. When an allocation conflict is caused by a hard constraint,
it can be resolved only by rescheduling the activities or, in the worst case,
by extending the earliest finishing time of the project.

Definition: Resource $R$ is referred to as a soft constraint if at any time $t$ more than
$R(t)$ units of it can be allocated by applying extra auxiliary resources.

With these notations every activity can be described with a vector
$T_{i,j} = (D_{i,j}, RH^1_{i,j}, RH^2_{i,j}, \ldots, RH^k_{i,j}, RS^1_{i,j}, RS^2_{i,j}, \ldots, RS^l_{i,j})$,
where $D_{i,j}$ denotes the amount of time and $R_{i,j}$ the number of resource units
required by activity $j$ of project $i$ for all $R \in RH \cup RS$.

A schedule $S = \{S_{i,j} : i = 1 \ldots n,\ j = 1 \ldots n_i\}$ of projects is given by the
start times of the activities $T_{i,j}$. For convenience let us introduce two more
dummy activities, $T_0$ and $T_1$, which represent the beginning and the completion of all
projects. $T_0$ becomes the first and $T_1$ the last activity of all schedules, without
any time or resource requirement. The following extra immediate precedence
relations are added for these dummy tasks:
$H = \{T_0 \to T_{i,j},\ T_{i,j} \to T_1 : i = 1 \ldots n,\ j = 1 \ldots n_i\}$.
With this notation the start time $S_0$ of $T_0$ is the
start time of the projects and, similarly, the start time $S_1$ of activity $T_1$ is the
earliest finishing time of $P$.

We can assume that every schedule starts at time $S_0 = 0$. With this
assumption $E(S) = S_1$ is the total implementation time of all the projects
in $P$ under the schedule $S$. Throughout this model only those schedules are
considered that preserve the immediate precedence relations.

Definition: Schedule $S$ is said to be feasible if the inequality $S_{i,j_1} + D_{i,j_1} \le S_{i,j_2}$
holds for every immediate precedence rule $T_{i,j_1} \to T_{i,j_2}$. The set of all
feasible schedules of projects $P$ is denoted by $\Re(H \cup H_1 \cup \cdots \cup H_n)$.



If resource constraints are taken into consideration, the set of feasible
schedules has to be restricted. Only those schedules are acceptable that do
not exceed the resource limits of the hard constraints. Let $U^R_S(t)$ denote the
utilization of resource $R$ at time $t$ under the schedule $S$. The function $U^R_S(t)$ can
be expressed by the following formula:

$U^R_S(t) = \sum_{i=1}^{n} \sum_{j=1}^{n_i} \delta_{i,j}(t) \cdot R_{i,j}$, where $\delta_{i,j}(t) = 1$ if $S_{i,j} \le t < S_{i,j} + D_{i,j}$, and $\delta_{i,j}(t) = 0$ otherwise.

Definition: Schedule $S$ is said to be resource-feasible for a set $C$ of resources
if $S \in \Re$ and $U^R_S(t) \le R(t)$ holds for all $t \in [0; E(S)]$ and $R \in C$; moreover,
it has the earliest finishing time $E(S) = \min\{E(S') : S' \in \Re\}$. The set of
resource-feasible schedules for a set $C$ of resources is denoted by $\Re_C$.

It is clear that the set $\Re_{RH}$ contains only schedules which are resource-feasible
for all hard constraints. This set can be determined by solving a resource
allocation problem (RAP). Henceforward we are interested in schedules in
$\Re_{RH}$ that are efficient for the soft constraints as well. There are several well-known
scalar indicators of the efficiency of a schedule. In this model we use a
new global bicriteria measure which tracks the "peak resource requirement"
and the "idle time" at the same time. This means locating Pareto optimal
solutions.
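The utilization function and the hard-constraint test defined above can be sketched as follows. This is an illustration only: it assumes integer unit-length time slots and a constant resource limit, and the names `utilization` and `is_hard_feasible` as well as the sample values are hypothetical.

```python
def utilization(schedule, durations, demands, horizon):
    """Utilization histogram U(t), t = 0..horizon-1, of a single resource.

    schedule, durations and demands map an activity key (i, j) to its
    start time, its duration and its resource requirement.
    """
    hist = [0] * horizon
    for key, start in schedule.items():
        # The activity is active for start <= t < start + duration.
        for t in range(start, min(start + durations[key], horizon)):
            hist[t] += demands[key]
    return hist

def is_hard_feasible(hist, limit):
    """True if U(t) <= R(t) for a constant limit R(t) = limit."""
    return all(u <= limit for u in hist)

durations = {(1, 1): 2, (2, 1): 5, (3, 1): 3}
demands = {(1, 1): 2, (2, 1): 3, (3, 1): 2}
early = utilization({(1, 1): 0, (2, 1): 0, (3, 1): 0}, durations, demands, 5)
print(early, is_hard_feasible(early, 5))      # [7, 7, 5, 3, 3] False
delayed = utilization({(1, 1): 3, (2, 1): 0, (3, 1): 0}, durations, demands, 5)
print(delayed, is_hard_feasible(delayed, 5))  # [5, 5, 5, 5, 5] True
```

Delaying one activity removes the conflict in this small example, which is the kind of repair that rescheduling under a hard constraint has to find.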

On the one hand, the utilization of a resource $R \in RS$ is optimal under
a schedule $S$ if minimal extra resource has to be applied. In other words,
we have to minimize the peak resource utilization level. Our first type of
objective function can be formulated as

$MU^R = \min_S \{MU^R(S)\}$, where $MU^R(S) = \max_t \{U^R_S(t)\}$.

On the other hand, the utilization of a resource $R \in RS$ is optimal under
a schedule $S$ if its fluctuation is minimal. The concept of idle times is based on
a global concavity measure on histograms introduced in [5]. Let the overline
operator $\overline{U}$ denote the quasi-concave hull of the utilization histogram $U$. With
this notation our second type of objective function can be expressed as

$IT^R = \min_S \{IT^R(S)\}$, where $IT^R(S) = \sum_{t=0}^{E(S)} \left(\overline{U^R_S}(t) - U^R_S(t)\right)$.

This kind of histogram concavity measure describes the requirement
of smooth resource utilization well.
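Both measures can be computed from a utilization histogram in a few lines. The sketch below rests on an assumption: the quasi-concave hull is taken to be the smallest unimodal (first non-decreasing, then non-increasing) histogram lying on or above $U$, which equals the pointwise minimum of the running maximum from the left and from the right. The exact construction used in [5] is not reproduced here, and all function names are hypothetical.

```python
from itertools import accumulate

def peak_utilization(hist):
    """MU(S): the highest utilization level in the histogram."""
    return max(hist)

def quasi_concave_hull(hist):
    """Smallest unimodal histogram that dominates hist (assumed hull)."""
    prefix = list(accumulate(hist, max))                  # running max from the left
    suffix = list(accumulate(reversed(hist), max))[::-1]  # running max from the right
    return [min(a, b) for a, b in zip(prefix, suffix)]

def idle_time(hist):
    """IT(S): total gap between the hull and the histogram."""
    hull = quasi_concave_hull(hist)
    return sum(h - u for h, u in zip(hull, hist))

hist = [2, 5, 3, 5, 1]
print(peak_utilization(hist))    # 5
print(quasi_concave_hull(hist))  # [2, 5, 5, 5, 1]
print(idle_time(hist))           # 2
```

The dip from 5 to 3 in the middle of the example is the only place where the histogram falls below its hull, so it accounts for the whole idle time.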

Definition: A hard-constraint-feasible schedule $S \in \Re_{RH}$ is Pareto optimal
on the objective functions $MU^R$ and $IT^R$ of the resources $R \in RS$ if there is
no other schedule $S' \in \Re_{RH}$ for which both $MU^R(S) \ge MU^R(S')$ and
$IT^R(S) \ge IT^R(S')$ hold for all resources $R \in RS$ and at least one of the
inequalities is strict.
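The dominance test in this definition translates directly into a filter over candidate schedules. In the sketch below each schedule is represented only by its vector of objective values (all $MU^R$ and $IT^R$ values in a fixed order); the names `dominates` and `pareto_front` and the sample vectors are hypothetical.

```python
def dominates(a, b):
    """True if vector a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """candidates: dict mapping a schedule name to its objective vector (minimized)."""
    return {
        name: vec
        for name, vec in candidates.items()
        if not any(dominates(other, vec) for other in candidates.values())
    }

candidates = {
    "A": (5, 7, 0, 0),  # (MU^1, MU^2, IT^1, IT^2)
    "B": (6, 7, 0, 2),  # dominated by A
    "C": (7, 5, 1, 0),  # incomparable with A
}
print(sorted(pareto_front(candidates)))  # ['A', 'C']
```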




3 The algorithm 

The algorithm follows almost directly from the model. It consists
of two phases: first a "makespan-minimization" problem is solved for
the hard constraints, then a "resource balancing" problem is solved on the set of
hard-constraint-feasible solutions of the first phase [8].

Phase 1: In this phase the set $\Re_{RH}$ is generated while the soft
constraints $R \in RS$ are neglected. Using the Critical Path Method (CPM) we may
get schedules $S \in \Re$ that cause resource allocation conflicts: active tasks
at time $t$ may require more hard resources than are available.
Schedules of this kind, which violate hard resource limits, are repaired first in this
phase. Problematic schedules are characterized by their minimal violating
sets. A violating set of activities of $S$ is called minimal if all the activities
together cause a resource allocation conflict but any proper subset of them can be
realized without conflict. Violations are resolved by delaying activities that
lack resources. The makespan-minimization computes new schedules
from the violating ones by adding new repairing immediate precedence rules
between the critical activities. When no hard-resource-feasible solution meets
the deadline determined by the CPM process, the earliest finishing time is
increased until a hard-resource-feasible solution is found.
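The notion of a minimal violating set can be made concrete with a brute-force enumeration over the activities that are active at the same time. This is only a sketch for small instances and a single hard resource; it is not the enumeration scheme of [6], and the function name and sample data are hypothetical.

```python
from itertools import combinations

def minimal_violating_sets(demands, limit):
    """demands: dict mapping each concurrently active activity to its demand.

    Returns every set whose total demand exceeds `limit` although each
    proper subset stays within the limit.
    """
    names = sorted(demands)
    result = []
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            total = sum(demands[a] for a in subset)
            if total <= limit:
                continue
            # Demands are nonnegative, so it suffices to test the subsets
            # obtained by dropping exactly one activity.
            if all(total - demands[a] <= limit for a in subset):
                result.append(set(subset))
    return result

active = {"T11": 2, "T21": 3, "T31": 2}
print(minimal_violating_sets(active, limit=5))  # one set containing all three activities
```

Each minimal violating set found this way can be resolved by adding one precedence rule between two of its members, which corresponds to one branch of the search tree.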

Phase 2: In the second phase each schedule of $\Re_{RH}$ is taken in turn.
It is clear that only the leaf nodes of the search tree used in phase 1 need to
be analyzed further. These leaf nodes are split into subnodes by adding
extra conflict-resolving precedence rules to the original set $H$. After
splitting the nodes, several new resource leveling problems (RLP) arise in which hard
constraints can be safely ignored, since any feasible schedule of the problems in
phase 2 is a hard-resource-feasible schedule of the corresponding problem of
phase 1. Efficient schedules of the new RLPs may be determined with the
same implicit enumeration method that was used in phase 1. Since
there are no conflicts on soft-constrained resources, the search trees
are notably larger. The exact solutions of the whole problem are those Pareto
optimal schedules that minimize all the functions $MU^R$ and $IT^R$ of the
soft constraints $R \in RS$.

Selecting a suitable implicit enumeration method is the key point in our
algorithm. From the wide range of available methods [3], the algorithm published
in [6] was found to be a good choice, since it is based on effective
pruning rules that make our algorithm capable of solving larger problems.

4 Application 

In software development the life cycle defines the phases and tasks that are
essential to systems development, no matter what type or size of system is developed.
These techniques and methodologies share a common "plan, execute,
and test" sequence of tasks. In the problem presented here we
assumed that the number of software designers is a hard constraint within
the given planning horizon, but the availability of programmers and testers
is a soft constraint, which can be resolved by "hiring-firing" on a short-term
basis. In this example we exploit the fact that in a small-scale business software
development process the usual iterative "waterfall" structure might be
replaced by a serial "designer-programmer-tester" chain.

The development team consists of five designers ($RH^1$), five programmers
($RS^1$) and five testers ($RS^2$). The level of available resources is constant, $R(t) = 5$,
during the development process, and every activity requires only one type
of resource. In Figure 1 the project structure is represented with
a simplified notation.





Projects                    D / RH^1   D / RS^1   D / RS^2
New feature development
  P1                        2/2        3/2        4/2
  P2                        5/3        3/3        4/4
  P3                        3/2        3/2        4/1
Bug fixing
  P4                        0/0        2/1        2/2
  P5                        0/0        4/3        2/3
Extensive system testing
  P6                        0/0        0/0        2/3
  P7                        0/0        0/0        3/3
Available resources
  Person                    5          5          5

[Figure 1: utilization histogram of $RH^1$, extended Gantt chart of the initial CPM schedule, and the corresponding search tree.]


Fig. 1. Time and resource requirements of projects in a software development pro- 
cess and extended Gantt chart of activities in the initial CPM schedule. 



The utilization histogram of resource $RH^1$ in Figure 1 traces out the conflicts
of the initial schedule. Since there is no conflict-free solution regarding
hard resources within 12 days, the earliest finishing time has to be delayed. With
a 1-day enlargement of the time horizon in phase 1 the number of feasible schedules
grows from 27 720 000 to 731 808 000, which illustrates the complexity
of this small example. Within 13 days two basically different hard-resource-feasible
solutions can be found. New immediate precedence rules correspond to
the branches of the search tree, see Figure 1. Only the case of adding the rule
$T_{1,1} \to T_{3,1}$ is investigated hereinafter. The other case needs similar calculations
and is therefore omitted here.

The Pareto optimal schedule shown in Figure 2 dominates the ones obtained
from the other case; therefore this Pareto optimal solution is the unique
optimum. This optimum yields a smooth workload for all resources, except
that two testers should be hired for the 9th day.

Computational work was done on a PC with a P166 MHz processor. The
program [5] was written in Visual Basic Version 6.0, but critical parts
were compiled as C++ DLL files.



Table 1. Program execution details

Phase               Number of feasible   Number of nodes      Execution time
                    schedules            in the search tree   in seconds
Phase 1. Node 0.    27 720 000           3                    0.150
Phase 2. Node 1.    1 386 000            282                  1.382
Phase 2. Node 2.    831 600              150                  0.621






[Figure 2: Gantt charts and utilization histograms for $H' = H \cup \{T_{1,1} \to T_{3,1}\}$, with $MU^{RS^1} = 5$, $MU^{RS^2} = 7$, $IT^{RS^1} = IT^{RS^2} = 0$.]





Fig. 2. Hard resource feasible schedule on node 1 of the search tree after the first 
phase and its Pareto optimal solution after the second phase. 



References 

1. S. E. Elmaghraby (1995): Activity nets: A guided tour through some recent 
developments. Eur. J. of Operational Research 82, 383-408 

2. V. Shankar, R. Nagi (1996) A flexible optimization approach to multi-resource, 
multi-project planning and scheduling. 5th Industrial Engineering Research Con- 
ference, Minneapolis MN, 263-267 

3. M. R. Zamani (2001): A high-performance exact method for the resource-constrained
project scheduling problem. Computers & Operations Research 28,
1387-1401

4. B. D. Reyck, W. Herroelen (1999): The multi-mode resource-constrained project 
scheduling problem with generalized precedence relations. Eur. J. of Operational 
Research 119, 538-556 

5. G. Csebfalvi (2001): Optimal resource leveling models for activity networks. 
Habilitation dissertation, PTE Pecs 120p 

6. G. Csebfalvi (1998): A fast exact solution procedure for the multiple resource- 
constrained project scheduling problem. Proc. APMOD ’98 Extended Abstracts, 
Limasol, Cyprus, 11-13 

7. G. Csebfalvi, P. Konstantinidis (1999): A new resource leveling procedure for 
the multiple resource-constrained project scheduling problem. Decision Sciences 
Institute 5th International Conference, Athens, Greece, 1723-1725 

8. G. Csebfalvi (2000): A new multi-criteria resource leveling procedure for the 
multiple resource-constrained project scheduling problem. Proc. EURO XVII. 





Minimizing Total Weighted Tardiness on Parallel
Batch Process Machines Using Genetic
Algorithms

Lars Mönch¹*, Hari Balasubramanian², John W. Fowler², Michele E. Pfund²
¹Institut für Wirtschaftsinformatik, Technische Universität Ilmenau, Germany
²Department of Industrial Engineering, Arizona State University, USA



1 Introduction

Recently, the electronics industry has become the largest industry in the world. A 
key aspect of this industry is the manufacturing of integrated circuits. In the past, 
sources of reducing costs were decreasing the size of the chips, increasing the wa- 
fer sizes and improving the yield, simultaneously with efforts to improve opera- 
tional processes inside the wafer fabrication facilities (wafer fab). Currently, it 
seems that the improvement of operational processes creates the best opportunity 
to realize the necessary cost reductions [6]. 

This research is motivated by a scheduling problem found in the diffusion area
of a wafer fab. In the diffusion area, there are several diffusion furnaces
which can be operated in parallel and can process several jobs simultaneously.
However, due to the different chemical nature of the processes, only jobs of the
same job family can be processed together. Carefully choosing which products to
batch together and in which order these jobs should be sequenced is of high importance,
as the processing times in the diffusion furnaces can be up to 10 hours,
versus a 1 hour processing time in other wafer fab processes. Therefore, proper
scheduling of the machines dedicated to the diffusion process steps has a great
impact on the performance of the entire wafer fab. Given that many wafer fabs are
driven by on-time delivery, the performance measure of interest is the total
weighted tardiness. This performance measure is given as $\sum w_{ij} T_{ij}$, where we
denote by $w_{ij}$ the weight or priority of job $j$ of family $i$. The tardiness of
job $j$ of family $i$ is given by $T_{ij} := \max(C_{ij} - d_{ij}, 0)$, where we denote by $C_{ij}$ the
completion time of job $j$ of family $i$ in the schedule and $d_{ij}$ represents the due
date of job $j$ of family $i$.
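For a finished schedule the performance measure is a one-line computation. The sketch below assumes each job is given as a (weight, completion time, due date) triple; the function name and the sample numbers are hypothetical.

```python
def total_weighted_tardiness(jobs):
    """jobs: iterable of (weight, completion_time, due_date) triples."""
    return sum(w * max(c - d, 0) for w, c, d in jobs)

# Only the first job is late (by 2 time units, weight 2).
print(total_weighted_tardiness([(2, 10, 8), (1, 5, 9), (3, 12, 12)]))  # 4
```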

In this paper, we extend previous research on single batch process machines
[2,3,4] to the more general case of parallel machines. We use a genetic algorithm
in order to assign batches to the parallel machines. After this decomposition step,
the problem reduces to single machine sequencing problems. For
the efficient solution of the single machine problems we use the same techniques
as suggested in [2,4].

* To whom correspondence should be addressed.

The paper is organized as follows. In the next section, we describe the problem 
under consideration in detail. Then we describe a four-phase-scheduling algorithm 
for obtaining the solution of the problem. We present results of computational ex- 
periments in the last section of this paper. 



2 Problem Description 

Following Pinedo (2002), scheduling problems can be represented in the form $\alpha \mid \beta \mid \gamma$.
The $\alpha$ field describes the machine environment (single machine, parallel machines,
job shop, etc.), the $\beta$ field describes the process characteristics, restrictions and
constraints (such as release dates, batching, setup-dependent operations), and the $\gamma$
field contains the information on the performance measure being considered. For
the problem studied in this paper the notation is:

$Pm \mid \text{batch, incompatible} \mid \sum w_j T_j$.

There are $f$ families of jobs and $m$ parallel machines. Family $i$ has $n_i$ jobs. The
jobs that are waiting to be processed must be batched by family, subject to the
maximum batch size $B$. Each single job has a weight $w_{ij}$ and a due date $d_{ij}$. Our
objective is to minimize the total weighted tardiness of the jobs.



3 Four-Phase-Scheduling-Algorithm 

Since the problem is NP-hard (cf. [1] for details), we propose a heuristic approach
in order to solve it. We suggest a four-phase scheduling algorithm:

• formation of the initial batches in the first phase, 

• assignment of the batches to the parallel machines in the second phase, 

• solution of the resulting single machine sequencing problems of the second 
phase in the third phase, 

• improvement of the objective function value using a swap procedure in the op- 
tional fourth phase. The objective is a modification of the batches after they 
have been scheduled. 



3.1 Forming Initial Batches 

We use two different heuristics to form the batches. The first one is based on the
EDD (Earliest Due Date) dispatching rule. For batching, the EDD heuristic orders
the jobs of each family in ascending order of their due dates and then forms full
batches (except possibly for the last one of each
family) starting with the smallest due date. The Apparent Tardiness Cost (ATC)
dispatching rule is an alternative to EDD for minimizing total weighted tardiness
[7,5]. We use the ATC rule in the following way:

1. We calculate for each job $j$ of family $i$ the ATC index given by
$I_{ij}(t) = \frac{w_{ij}}{p_{ij}} \exp\left(-\frac{\max\{d_{ij} - p_{ij} - t, 0\}}{k \bar{p}}\right)$,
where we denote by $k$ a look-ahead
parameter that is varied between 0.5 and 5.0. We use the notation $\bar{p}$ for the average
processing time of the remaining jobs.

2. The time t is the time when any of the available machines becomes free. At 
this time t, all available jobs in each family are ordered in descending order of 
their ATC index. The first B jobs are batched together in each family. Among 
the different batches the batch with the highest sum of ATC indices (called 
BATC index) is chosen. 

Note that the ATC dispatching rule in this form (called the ATC-BATC rule) can
be used to solve the parallel batch process machine problem on its own, i.e.,
without the other phases of the suggested algorithm. We use this heuristic in order
to evaluate and compare the performance of our more sophisticated algorithm.
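The two batch-forming heuristics of this subsection can be sketched as follows. This is a hedged illustration, not the authors' implementation: jobs are assumed to be dictionaries with the hypothetical keys `family`, `due`, `p` and `w`, and $\bar{p}$ is taken as the average processing time of the jobs passed in.

```python
import math

def group_by_family(jobs):
    families = {}
    for job in jobs:
        families.setdefault(job["family"], []).append(job)
    return families

def edd_batches(jobs, batch_size):
    """Form batches per family in ascending due date order (EDD)."""
    batches = []
    for family_jobs in group_by_family(jobs).values():
        ordered = sorted(family_jobs, key=lambda j: j["due"])
        for start in range(0, len(ordered), batch_size):
            batches.append(ordered[start:start + batch_size])
    return batches

def atc_index(job, t, k, p_bar):
    """ATC index of a job at time t with look-ahead parameter k."""
    slack = max(job["due"] - job["p"] - t, 0)
    return job["w"] / job["p"] * math.exp(-slack / (k * p_bar))

def batc_batch(jobs, t, k, batch_size):
    """Return the family batch with the highest summed ATC index (BATC index)."""
    p_bar = sum(j["p"] for j in jobs) / len(jobs)
    best, best_score = None, -1.0
    for family_jobs in group_by_family(jobs).values():
        ranked = sorted(family_jobs, key=lambda j: atc_index(j, t, k, p_bar),
                        reverse=True)
        batch = ranked[:batch_size]          # the first B jobs of the family
        score = sum(atc_index(j, t, k, p_bar) for j in batch)
        if score > best_score:
            best, best_score = batch, score
    return best
```

Calling `batc_batch` each time a machine becomes free, and removing the chosen jobs from the waiting list, gives a stand-alone dispatching heuristic of the ATC-BATC kind described above.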



3.2 Assignment of Batches 

We use a genetic algorithm (GA) in order to assign the initially formed batches to
the parallel machines. Each assignment is evaluated by aggregating the total
weighted tardiness of the sequences (obtained on each machine by using sequencing
heuristics for the single machine case) over all machines. The GA converges towards
assignments that give good solutions. The overall framework of the genetic
algorithm is shown in Figure 3.1.

We used a batch-based chromosome representation in the GA. A solution of the
problem of assigning batches to machines is an array whose length is equal to the
number of batches. The $s$-th element of the array represents the machine that is
used to process batch $s$. Each chromosome is randomly initialized by generating a
random number from $\{1, \ldots, m\}$ for each single gene. A standard one-point crossover
operator is used for performing the crossover operations. According to a predefined
probability, a gene of a chromosome is randomly changed to a different machine
number for mutation purposes. Each chromosome is evaluated with the total
weighted tardiness value: after the batches are assigned to machines, each machine
is sequenced and the sum of the total weighted tardiness values of all machines
is used as the fitness function of the GA.

A steady-state GA with overlapping populations is used (cf. [8]). The GA is
controlled by a prescribed number of populations and by using a diversity measure.
For the parameter settings of the GA we refer to [1].
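The chromosome representation and the genetic operators described above can be sketched in plain Python. The following is an illustration under several assumptions and does not use the GaLib framework: machines are numbered from 0, each machine is sequenced by a simple EDD rule on batch level, at least two batches are required, and the steady-state replacement scheme, the parameter values and all names are hypothetical.

```python
import random

def evaluate(chromosome, batches, m):
    """Sum of total weighted tardiness over all machines (lower is better).

    batches: list of dicts with 'p' (batch processing time) and 'jobs',
    a list of (weight, due_date) pairs. Gene s is the machine of batch s.
    """
    total = 0
    for machine in range(m):
        assigned = [b for b, g in zip(batches, chromosome) if g == machine]
        # Assumed sequencing rule: earliest job due date of the batch first.
        assigned.sort(key=lambda b: min(d for _, d in b["jobs"]))
        t = 0
        for b in assigned:
            t += b["p"]
            total += sum(w * max(t - d, 0) for w, d in b["jobs"])
    return total

def one_point_crossover(a, b, rng):
    cut = rng.randrange(1, len(a))  # requires at least two genes
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(chromosome, m, rate, rng):
    out = list(chromosome)
    for s in range(len(out)):
        if m > 1 and rng.random() < rate:
            # Change the gene to a different machine number.
            out[s] = rng.choice([x for x in range(m) if x != out[s]])
    return out

def run_ga(batches, m, pop_size=20, generations=50, rate=0.1, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(m) for _ in batches] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: evaluate(c, batches, m))
        a, b = rng.sample(pop[: pop_size // 2], 2)   # parents from the better half
        for child in one_point_crossover(a, b, rng):
            pop.append(mutate(child, m, rate, rng))
        # Overlapping populations: children compete with their parents.
        pop.sort(key=lambda c: evaluate(c, batches, m))
        pop = pop[:pop_size]
    return pop[0], evaluate(pop[0], batches, m)
```

A call such as `run_ga(batches, m=3)` returns the best assignment found together with its total weighted tardiness.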









Fig. 3.1. Overall Framework of the Genetic Algorithm 



3.3 Solution of the Single Machine Sequencing Problems 

For sequencing the batches on each single machine we use the EDD rule, the
BATC rule and a sequence decomposition heuristic suggested by Mehta and
Uzsoy [3]. Starting from an initial sequence obtained by a heuristic, a sub-sequence of size
$\lambda$ is rearranged into an optimal sequence (with respect to total weighted tardiness) by
using total enumeration. The first $\alpha$ batches of this optimal sequence are fixed
in the final sequence, and the next $\lambda$ batches are considered. This process is repeated
until all batches are covered and no further improvement of the performance
measure is obtained. We refer to [4] for a more detailed description of these heuristics.
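The decomposition idea can be illustrated with a small sketch. It is a simplified reading of the description above, not the procedure of [3] or [4]: each window is optimized only with respect to the tardiness of the batches inside it, the parameters are assumed to satisfy $1 \le \alpha \le \lambda$, and the batch layout (keys `p` and `jobs`) and function names are hypothetical.

```python
from itertools import permutations

def twt(sequence, start=0):
    """Total weighted tardiness of a batch sequence and its completion time."""
    t, total = start, 0
    for b in sequence:
        t += b["p"]
        total += sum(w * max(t - d, 0) for w, d in b["jobs"])
    return total, t

def decomposition_pass(sequence, lam, alpha):
    fixed, rest, t = [], list(sequence), 0
    while rest:
        window, tail = rest[:lam], rest[lam:]
        # Total enumeration of the current window of at most lam batches.
        best = min(permutations(window), key=lambda perm: twt(perm, t)[0])
        head = list(best[:alpha])            # fix the first alpha batches
        fixed += head
        t = twt(head, t)[1]
        rest = list(best[alpha:]) + tail
    return fixed

def decomposition_heuristic(sequence, lam=4, alpha=2):
    best = list(sequence)
    while True:
        candidate = decomposition_pass(best, lam, alpha)
        if twt(candidate)[0] >= twt(best)[0]:
            return best                      # no further improvement
        best = candidate
```

Since every window is enumerated completely, the effort per window grows with $\lambda!$, so only small values of $\lambda$ are practical in such a sketch.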



3.4 Swapping Procedure 

We use the swap procedure suggested in [2] in order to improve the quality of the 
schedules. The original swap procedure exchanges jobs for all batches of one fam- 
ily across a single machine with the objective to reduce the performance measure. 
We propose an extension of this technique to the parallel machine case by redis- 
tributing jobs not only across compatible batches of the same family but also 
across the parallel machines. 










4 Results 

In this section we present results of computational experiments. We used test data
generated according to an extension of the test data description given in [4] to the
parallel machine case. The test cases depend on the batch size, the number of parallel
machines, the number of jobs of each job family, the due date range $R$ and
the tightness $T$ of the schedules. In our experiments we considered the case of
three different job families.

All experiments were performed on a Pentium III, 800 MHz PC. We used the
object-oriented framework GaLib (cf. [8]) in order to implement the suggested GA
in an efficient way. The average computing time of the GA was one minute per
problem instance. The results of the experiments are given in Table 4.1. We show
the ratio of the performance measure value of the suggested variant of the four-phase
scheduling algorithm to the performance measure value of the ATC-BATC
heuristic. Several configurations of our algorithm outperform the
ATC-BATC heuristic. As can be seen from the results, the algorithm is sensitive to
the due date range $R$ and the tightness $T$ of the schedules.



5 Conclusions and Outlook to Future Research 

In this paper we suggested a four-phase heuristic algorithm for the solution of parallel
batch process machine scheduling problems. We do not know how close we
are to an optimal solution. Therefore, one direction of future research is
the determination of lower bounds for total weighted tardiness. Another direction
of future research is the incorporation of ready times of the jobs
into the algorithm.



Acknowledgement 

The authors gratefully acknowledge the support of the Semiconductor Research 
Corporation (2001-NJ-880). Parts of this research were carried out whilst the first
author was visiting the Modeling and Analysis of Semiconductor Manufacturing 
Laboratory at the Arizona State University, Department of Industrial Engineering, 
Tempe. 








Table 4.1. Results for Different Configurations of the Algorithm 



Combinations/       EDD-GA-EDD   ATC-GA-BATC   ATC-GA-BATC-Swap   ATC-GA-DH-Swap
Characteristics
Machines
  ...               1.96         0.98          0.92               0.92
  m=5               1.86         0.98          0.93               0.89
Batch Sizes
  B=4               ...          0.99          ...                0.92
  B=8               1.81         ...           0.92               0.91
Jobs/Family
  80                1.92         1.00          0.91               0.91
  100               2.01         0.99          ...                ...
Due Date Range
  R=0.5             2.38         0.96          0.87               0.85
  R=2.5             1.47         1.00          0.98               0.96
Tightness
  T=0.30            1.96         0.97          0.88               0.86
  T=0.60            1.89         1.00          0.97               0.95




References 

[1] Balasubramanian, H. 2002. Minimizing Total Weighted Tardiness on Parallel Batch 
Processing Machines with Incompatible Job Families. Master Thesis, Arizona State 
University, Department of Industrial Engineering. 

[2] Devpura, A., J. W. Fowler, M. W. Carlyle, I. Perez. 2000. Minimizing Total Weighted
Tardiness on Single Batch Process Machines with Incompatible Job Families. Proceedings
Symposium on Operations Research, 366-371.

[3] Mehta, S. V., R. Uzsoy. 1998. Minimizing Total Tardiness on a Single Batch Processing
Machine with Incompatible Job Families. IIE Transactions, 30, 165-178.

[4] Perez, I., J. W. Fowler, M. W. Carlyle. 2002. Minimizing Total Weighted Tardiness on
a Single Batch Process Machine with Incompatible Job Families. Submitted to Computers
and Operations Research.

[5] Pinedo, M. 2002. Scheduling: Theory, Algorithms, and Systems. Prentice Hall, New Jersey,
Second Edition.

[6] Schömig, A., J. W. Fowler. 2000. Modelling Semiconductor Manufacturing Operations.
Proceedings of the 9th ASIM Dedicated Conference Simulation in Production and Logistics,
eds.: K. Mertins and M. Rabe, 55-64.

[7] Vepsalainen, A., T. E. Morton. 1987. Priority Rules and Lead Time Estimates for Job 
Shop Scheduling with Weighted Tardiness Costs. Management Science, Vol. 33, 1036- 
1047. 

[8] Wall, M. 1995. GaLib - A C++ Library of Genetic Algorithm Components, 
http://lancet.mit.edu/ga/. 











Sorting with Line Storage Systems 



Thomas Epping (1,2) and Winfried Hochstättler (1,3)

(1) Department of Mathematics, BTU Cottbus, 03013 Cottbus, Germany
(2) epping@math.tu-cottbus.de
(3) hochstaettler@math.tu-cottbus.de



Abstract. We consider a problem that arises in car production. In particular, we 
focus on the sorting of car bodies with respect to their designated enamel colors for 
a paint shop. Our objective is to sort a given car body sequence so that the number 
of color changes within the sorted sequence is minimized. Current technology is to 
perform the realignment of a sequence by the use of a line storage system. 



1 Introduction 

Interim storage systems are an essential tool for production control. They 
are installed in front of production shops to arrange elements of an incoming 
sequence in a way that increases the efficiency of the particular production 
shop and thus reduces production costs. Several types of interim storage sys- 
tems exist (see [1]). Among them, line storage systems are the most popular 
due to their simplicity and low investment costs. 

We deal with line storage systems installed in front of the paint shop of an 
automobile plant. These systems are used to permute an incoming sequence 
of car bodies so that the number of color changes arising in the outgoing 
sequence within the enamel booth of the paint shop is minimized. 

Definition 1. Given a color set $F$ and a color sequence $f = (f_1, \ldots, f_n)$
with $f_i \in F$ for $i = 1, \ldots, n$, we say that we have a color change within
$f$ whenever $f_i \neq f_{i+1}$. We denote the number of color changes within $f$
by $\gamma(f)$.
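
As a small illustration of Definition 1, the following Python sketch counts the
color changes of a sequence. The function name `color_changes` and the use of
single letters as colors are assumptions made for this example only.

```python
def color_changes(sequence):
    """Number of positions i with sequence[i] != sequence[i + 1]."""
    # Compare every element with its successor and count the mismatches.
    return sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)


if __name__ == "__main__":
    # Changes occur at a|b, b|c and c|a.
    print(color_changes("aabbbca"))
```

An empty or single-element sequence has no color change, which the sketch
handles without a special case.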

A line storage system (see Figure 1) consists of a lookahead area $L$ of
length $l$ that holds the next $l$ elements of the input sequence, a set
$Q_1, \ldots, Q_r$ of sorting belts, each of length $q$, and an output belt. We
model the lookahead area as a stack, and each sorting belt as a queue. Different
colors are represented by different letters. Note that $L$ and each $Q_i$ are
ordered multisets.

Two problems arise in this context. First suppose that we are given a 
snapshot of the line storage system. We focus only on the retrieval of al- 
ready stored colors, ignore the lookahead area and interpret each sorting belt 
as a stack. Then the problem consists in the retrieval of the colors on the 
stacks such that the number of color changes within the retrieval sequence is 
minimized. 










[Figure: labelled components are storage rules, sorting belts, retrieval rules,
output belt, and paint shop]

Fig. 1. Example of a line storage system



Problem 1. Color Retrieval Problem (CRP) 

Instance  A finite set $S_1, \ldots, S_r$ of stacks of length $l_1, \ldots, l_r$
          that contain colors of a finite color set $F$.

Question  Retrieve the colors from $S_1, \ldots, S_r$ in a sequence $R$ such that
          $\gamma(R)$ is minimized.

The second problem is an extension of the first one and includes the
distribution of the $l$ colors contained in $L$ on $Q_1, \ldots, Q_r$ together
with the retrieval of all colors from $Q_1, \ldots, Q_r$ such that the number of
color changes arising on the output belt is minimized. Each $Q_i$ may initially
contain colors, and storage and retrieval operations are allowed to take place
in any order.

Problem 2. Color Storage and Retrieval Problem (CSRP) 

Instance  A finite set $Q_1, \ldots, Q_r$ of queues, each having length $q$, and a
          stack $L$ of length $l$ that contain colors of a finite color set $F$.

Question  Store the colors in $L$ on $Q_1, \ldots, Q_r$ and retrieve all colors
          from $Q_1, \ldots, Q_r$ in an order that minimizes the number of color
          changes within the output sequence.

We describe solution approaches to both problems in the next sections, 
using fairly standard notation. We consider Problem 1 first. 



2 The CRP 

We show that the CRP is in a certain sense equivalent to the multiple se- 
quence alignment problem known from molecular biology. This problem can 
be solved by a fast dynamic programming algorithm (A* algorithm). We give
a short overview of multiple sequence alignment and describe our modifica- 
tions to the dynamic program (see [2] for more details on this section). 



2.1 Multiple Sequence Alignment 

Multiple sequence alignment (MSA) is used in molecular biology to detect 
transformations of DNA or protein sequences. We give a brief sketch of MSA 
(see [3]). Suppose that we are given $r$ sequences $a_1, \ldots, a_r$ with
$a_{ij} \in F$, where $F$ is any finite alphabet (see Figure 2(a)). We speak of
an alignment of $a_1, \ldots, a_r$ if we insert elements $- \notin F$ into
$a_1, \ldots, a_r$ such that we get sequences $a'_1, \ldots, a'_r$ that have the
same length $c$ with $\max_i \{n_i\} \le c \le \sum_i n_i$. In the following, we
assume that an alignment of $r$ sequences is given in the form of an
$(r \times c)$-alignment matrix $A$ with $a'_{ij} \in F \cup \{-\}$ (see
Figure 2(b)).

(a)
$$a_1 = (a_{11} a_{12} \ldots a_{1n_1}), \quad
  a_2 = (a_{21} a_{22} \ldots a_{2n_2}), \quad \ldots, \quad
  a_r = (a_{r1} a_{r2} \ldots a_{rn_r})$$

(b)
$$A = \begin{pmatrix}
  a'_{11} & a'_{12} & \cdots & a'_{1c} \\
  a'_{21} & a'_{22} & \cdots & a'_{2c} \\
  \vdots  & \vdots  &        & \vdots  \\
  a'_{r1} & a'_{r2} & \cdots & a'_{rc}
\end{pmatrix}$$

Fig. 2. Illustration of multiple sequence alignment

The MSA problem consists in the computation of an alignment matrix
$A$ that has a minimal value $D(A)$ with respect to some scoring function $D$.
It is NP-complete for the sum of pairs score (see Section 2.2). Any MSA
instance can be solved by dynamic programming with a time complexity of
$O(2^r \prod_i n_i)$ (see [4]).



2.2 MSA and CRP 



We interpret each stack of Problem 1 as an input sequence for the MSA 
problem and want an optimal alignment matrix to give an optimal retrieval 
sequence for the colors on the stacks. 

First we assume that no stack contains a subsequence of two or more 
consecutive identical colors by merging any subsequence of that form into a 
single color element. Correspondingly, we assume that each color on a stack 
is framed by different colors. 

Proposition 1. The merging of consecutive identical colors does not in- 
crease the minimal number of color changes for an instance of the CRP. 
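
The preprocessing step behind Proposition 1 can be sketched in a few lines of
Python. The function name `merge_runs` and the representation of a stack as a
string or list of color letters are assumptions of this example, not part of
the original implementation.

```python
from itertools import groupby


def merge_runs(stack):
    """Collapse every run of consecutive identical colors into one element."""
    # groupby yields one group per maximal run of equal neighbours.
    return [color for color, _run in groupby(stack)]


if __name__ == "__main__":
    print(merge_runs("aabbbca"))
```

After this step every color on a stack is framed by different colors, as
assumed in the text.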

Now the main idea is to calculate an optimal MSA matrix A and to derive 
an optimal retrieval sequence R for the colors on the stacks by flattening A 
column by column into the sequence 



$$R = (a'_{11} a'_{21} \ldots a'_{r1} \; a'_{12} a'_{22} \ldots a'_{r2} \; \ldots \;
       a'_{1c} a'_{2c} \ldots a'_{rc}),$$

assuming that each stack is open to the left. If we pass through $R$ from the
left to the right, we interpret an element $a'_{ij} = f$ as withdrawing color
$f$ from stack $i$ and an element $a'_{ij} = -$ as withdrawing nothing from
stack $i$.
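
The flattening step can be illustrated as follows. This is only a sketch: the
matrix is assumed to be a list of equally long rows, the gap symbol is assumed
to be the string "-", and `flatten` is a hypothetical helper name.

```python
GAP = "-"  # assumed gap symbol, not an element of the color set F


def flatten(alignment):
    """Read an r x c alignment matrix column by column.

    Returns the withdrawal sequence as (stack index, color) pairs;
    gap entries mean "withdraw nothing" and are skipped.
    """
    rows = len(alignment)
    cols = len(alignment[0]) if rows else 0
    withdrawals = []
    for j in range(cols):          # columns from left to right
        for i in range(rows):      # stacks from top to bottom
            if alignment[i][j] != GAP:
                withdrawals.append((i, alignment[i][j]))
    return withdrawals


if __name__ == "__main__":
    A = [["a", "b", "-"],
         ["-", "b", "-"],
         ["a", "-", "c"]]
    print(flatten(A))
```

For the example matrix the withdrawn colors are a, a, b, b, c, so each of the
three columns contributes a single color.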

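For the calculation of an optimal alignment matrix $A$ we use the sum of
pairs score

$$D(A) = \sum_{i<j} \sum_{k=1}^{c} d(a'_{ik}, a'_{jk})$$

and define a distance function for two sequence elements
$x, y \in F \cup \{-\}$ by

$$d(x, y) := \begin{cases}
  0,      & \text{if } x = y \\
  1,      & \text{if } x = - \text{ or } y = - \\
  \infty, & \text{if } x \neq y
\end{cases}$$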

These settings prevent any column of an optimal MSA matrix $A$ from containing
more than one color and force $A$ to contain as few columns as possible.
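
To make the scoring concrete, the sketch below evaluates the sum of pairs
score of a given alignment matrix. It assumes a non-empty matrix stored as a
list of equally long rows and the string "-" as gap symbol; the three cases of
the distance function are applied in the order in which they are listed. The
names `d` and `sum_of_pairs` are chosen for this example.

```python
import math
from itertools import combinations

GAP = "-"  # assumed gap symbol


def d(x, y):
    """Distance of two matrix entries (cases checked in the listed order)."""
    if x == y:
        return 0
    if x == GAP or y == GAP:
        return 1
    return math.inf  # two different colors in one column are forbidden


def sum_of_pairs(alignment):
    """Sum of d over all pairs of rows and all columns."""
    cols = len(alignment[0])
    return sum(d(row_i[k], row_j[k])
               for row_i, row_j in combinations(alignment, 2)
               for k in range(cols))


if __name__ == "__main__":
    A = [["a", "b", "-"],
         ["-", "b", "-"],
         ["a", "-", "c"]]
    print(sum_of_pairs(A))
```

A matrix with two different colors in the same column receives an infinite
score, so it can never be optimal.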

Lemma 1. Suppose that the solution of a CRP instance results in an optimal
$(r \times c)$-MSA matrix and a corresponding withdrawal sequence $R$. Then $R$
contains $\gamma(R) = c - 1$ color changes. Conversely, we can reconstruct from
$R$ an MSA matrix with $c$ columns and $r$ rows, where $r$ is the length of a
longest subsequence of consecutive identical colors in $R$.

Thus, after applying Proposition 1 and using the distance function defined 
above, our solution approach solves any instance of the CRP to optimality. 

Theorem 1. The solution of an instance of CRP is equivalent to the solution 
of an MSA problem as described above. The flattening of the resulting optimal 
MSA matrix yields an optimal withdrawal sequence R. 

Preliminary computational results on random instances attest the appli- 
cability of our solution approach in practice. For example, the solution of a 
random instance of CRP consisting of r = 6 sorting belts, each filled with 
$q = 20$ colors of an underlying color set of size $|F| = 15$, requires an
average running time of 5 seconds on a SUN E450 with 4 SunUltraII 400 MHz
processors and 1152 MB memory, using the implementation described in [4]. 

3 The CSRP 

We now describe a dynamic programming approach for the solution of Prob- 
lem 2. We give a description of the dynamic program and mention some 
implementation aspects (see [5] for more details on this section). 



3.1 A Dynamic Program 

We denote a state of our dynamic program by $S = (c; Q_1, \ldots, Q_r; p)$, where
$c$ denotes the last color on the output belt, $Q_1, \ldots, Q_r$ denote the
content of the queues, and $1 \le p \le l$ denotes the current position in the
lookahead area. We set $c = \emptyset$ if the output belt contains no colors,
and $p = l + 1$ to indicate that $L$ is completely processed. We denote the
color at the lookahead position $p$ by $L_p$.

Given a state $S$, we assume that new states can be created by applying
the following two basic operations to every queue $Q_i$ of $S$ (if possible):
Store $L_p$ on $Q_i$ (without moving any queue), or move $Q_i$ by one position.
We denote the set of new states obtained after the application of the first and
the second basic operation to every queue of $S$ by $N^{+}(S)$ and $N^{\to}(S)$,
respectively. Note that $|N^{+}(S)| \le r$ and $|N^{\to}(S)| \le r$ hold for
every state $S$.

Our basic operations admit gaps on the queues and the output belt. We
represent gaps by an additional “color” $f_0 \notin F$. Thus a color $f \in F$
or $f_0$ drops on the output belt whenever we move a queue. For convenience, we
ignore the artificial color $f_0$ on the output belt, as it does not affect the
number of color changes.

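Proposition 2. Suppose that we are given a state $S = (c; Q_1, \ldots, Q_r; p)$.

1. The set $N^{+}(S)$ is created as follows: Create a state

   $$S' = (c; Q_1, \ldots, Q'_i, \ldots, Q_r; p + 1)$$

   with $Q'_i = (f_1, \ldots, f_{q-1}, L_p)$ for every queue
   $Q_i = (f_1, \ldots, f_{q-1}, f_0)$ of $S$ and add $S'$ to $N^{+}(S)$.

2. The set $N^{\to}(S)$ is created as follows: Create a state

   $$S' = (c'; Q_1, \ldots, Q'_i, \ldots, Q_r; p)$$

   with $Q'_i = (f_2, \ldots, f_q, f_0)$ for every queue
   $Q_i = (f_1, \ldots, f_{q-1}, f_q)$ of $S$, where $c'$ denotes the last color
   on the output belt after the move, and add $S'$ to $N^{\to}(S)$.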

We denote the current minimal number of color changes on the output
belt of a state $S$ by $\gamma(S)$ and use the following proposition to compute
$\gamma(S)$.

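Proposition 3. Suppose that we are given a state $S'$. The value $\gamma(S')$ can
be computed by

$$\gamma(S') = \min_{P_1 \in P^{+},\ P_2 \in P^{\to}}
  \{\gamma(P_1),\ \gamma(P_2) + \delta(P_2, S')\}$$

with $P^{+} = \{P : S' \in N^{+}(P)\}$, $P^{\to} = \{P : S' \in N^{\to}(P)\}$,
and $\delta$ defined by

$$\delta(P_2, S') = \begin{cases}
  0, & \text{if } c = c' \text{ or } c = \emptyset \\
  1, & \text{else}
\end{cases}$$

for $P_2 = (c; Q_1, \ldots, Q_r; p)$ and
$S' = (c'; Q'_1, \ldots, Q'_r; p) \in N^{\to}(P_2)$.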

Thus, to compute $\gamma(S')$ for a state $S'$, we have to consider $\gamma(S)$
of all states $S$ from which $S'$ can be created by the application of one of
our two basic operations. In addition, if $S'$ was created from $S$ by the
movement of a queue, we have to check whether an additional color change
occurred on the output belt of $S'$ or not.

We use Proposition 3 to formulate our dynamic program (see Figure 3).
An initial state $S_0 = (\emptyset; Q_1, \ldots, Q_r; 1)$ has an empty output
belt, initially filled queues $Q_1, \ldots, Q_r$, and a lookahead position of
$p = 1$. A state $S = (c; \emptyset, \ldots, \emptyset; l + 1)$ is completely
processed.

Lemma 2. The dynamic program depicted in Figure 3 solves an instance of
the CSRP requiring a time complexity of
$O((|F| + 1)^{rq} \cdot |F| \cdot l \cdot r)$ and a space complexity of
$O((|F| + 1)^{rq} \cdot |F| \cdot l)$.








1. Create an initial state $S_0$ with $\gamma(S_0) = 0$ and set $P \leftarrow \{S_0\}$

2. While $P \neq \emptyset$
     Set $Q \leftarrow \emptyset$
     For all $S \in P$
       For all $S' \in N^{+}(S) \cup N^{\to}(S)$
         Compute $\gamma(S')$ as described in Proposition 3
         If $S'$ is not completely processed, set $Q \leftarrow Q \cup \{S'\}$
         Else update best solution
     Set $P \leftarrow Q$

3. Return best solution

Fig. 3. The dynamic program for CSRP
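
The state space search of Figure 3 can be sketched in Python as below. This is
not the authors' implementation: instead of the layered sets of Figure 3 it
uses a priority queue over states (a Dijkstra-style search), which computes the
same minimum. Further assumptions of the sketch: gaps are represented by
`None`, index 0 of a queue is its head next to the output belt, the last index
is the storage position, the lookahead position is counted from 0, and moving a
completely empty queue is skipped because it does not change the state. All
function names are hypothetical.

```python
import heapq
from itertools import count

GAP = None  # stands for the artificial color f_0


def successors(state, lookahead):
    """Yield (next state, additional color changes) for both basic operations."""
    c, queues, p = state
    for i, queue in enumerate(queues):
        # Store L_p on queue i: only possible if the storage position is a gap.
        if p < len(lookahead) and queue[-1] is GAP:
            stored = queue[:-1] + (lookahead[p],)
            yield (c, queues[:i] + (stored,) + queues[i + 1:], p + 1), 0
        # Move queue i by one position: its head drops onto the output belt.
        if any(f is not GAP for f in queue):
            head = queue[0]
            moved = queue[1:] + (GAP,)
            if head is GAP:
                new_c, step = c, 0          # gaps are ignored on the belt
            else:
                new_c = head
                step = int(c is not GAP and c != head)
            yield (new_c, queues[:i] + (moved,) + queues[i + 1:], p), step


def min_color_changes(queues, lookahead):
    """Minimal number of color changes on the output belt."""
    start = (GAP, tuple(tuple(q) for q in queues), 0)
    best = {start: 0}
    tie = count()                            # avoids comparing states in the heap
    heap = [(0, next(tie), start)]
    while heap:
        cost, _, state = heapq.heappop(heap)
        if cost > best[state]:
            continue                         # outdated heap entry
        _, qs, p = state
        if p == len(lookahead) and all(f is GAP for q in qs for f in q):
            return cost                      # completely processed state
        for nxt, step in successors(state, lookahead):
            if cost + step < best.get(nxt, float("inf")):
                best[nxt] = cost + step
                heapq.heappush(heap, (cost + step, next(tie), nxt))
    return None


if __name__ == "__main__":
    print(min_color_changes([("a", None), ("b", None)], ["b", "a"]))
```

In the example both lookahead colors can be stored behind a car of the same
color, so a single color change on the output belt suffices. The sketch
enumerates states explicitly and is therefore only usable for very small
instances, in line with the complexity stated in Lemma 2.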

3.2 Implementation Details 

The huge theoretical complexity of the dynamic program can be drastically 
reduced in practice. Some non-obvious reasons for that are the admissibility 
to ignore gaps between colors on the sorting belts and on the lookahead, the 
avoidance of equivalent states with permuted queue sets by a lexicographical 
queue ordering, and the use of a suitable hash function for fast access to 
states. 

Nevertheless, the running time still depends significantly on the color
frequencies and distribution on the queues and the lookahead, the queue and
the lookahead size, and the size of the underlying color set. We therefore doubt
the practical use of our dynamic program, although it may be applicable for
simulation purposes. For example, the solution of random CSRP instances
consisting of $r = 3$ sorting belts (each of length $q = 10$ and initially
filled with 5 colors), a completely filled lookahead of size $l = 8$, and an
underlying color set of size $|F| = 4$, already requires an average time of
approximately 400 seconds on a Pentium 3 PC with 900 MHz.

References 

1. Wortmann D., Spieckermann S. (1995) Manufacturing Line Simulation of
Automotive Industry to Enhance Productivity and Profitability. In: M. R. Heller
(Ed.), Automotive Simulation ’95, SCS-Society for Computer Simulation
International, Erlangen, Germany, 91-106

2. Epping Th., Hochstättler W. (2001) Abuse of Multiple Sequence Alignment
in a Paint Shop. Technical report zaik2001-418, Center of Applied Computer
Science, University of Cologne, Germany

3. Waterman M. S. (1995) Introduction to Computational Biology. Chapman &
Hall

4. Lermen M., Reinert K. (1997) The Practical Use of the A* Algorithm for Exact
Multiple Sequence Alignment. Technical Report 97-1-028, Max-Planck-Institut
für Informatik, Saarbrücken, Germany

5. Epping Th., Hochstättler W. (2002) Storage and Retrieval of Car Bodies by the
Use of Line Storage Systems. Technical report btu-lsgdi-001.02, BTU Cottbus,
Germany





On Solvability of the Project Scheduling 
Problem with Accumulative Resources 
of an Arbitrary Sign* 



Edward Gimadi (1,2) and Sergey Sevastianov (1)

(1) Sobolev Institute of Mathematics, 630090, Novosibirsk, Russia
(2) Corresponding author: gimadi@math.nsc.ru



Abstract. A project scheduling problem (PSP) with precedence constraints,
deadlines and resource constraints is considered. It was known before that PSP
is polynomially solvable if all resource constraints are of the so-called
accumulative type (please do not confuse this notion with the completely
different notions cumulative and cumulatives recently introduced in the
literature, which are nothing but a kind of renewable resource constraint!),
provided that both resource availabilities and resource requirements are
non-negative. Now we show that PSP with accumulative resource constraints
remains polynomially solvable if we allow resource availabilities of an
arbitrary sign, while all resource requirements remain non-negative. On the
other hand, it is shown that in the case of resource requirements of an
arbitrary sign the problem becomes NP-hard in the strong sense.



1 Introduction 

For the resource constrained project scheduling problem (PSP), in parallel
with renewable resources [3,4], a notion of accumulative resources has been
known since 1965 [8]. The PSP problem with this type of resource constraint was
investigated in a number of papers (see, e.g., [5,7]), mostly written in
Russian. It should be noted that the nature of accumulative resource
constraints differs drastically from that of renewable ones. This makes it
possible to present polynomial time algorithms for solving quite general cases
of the PSP problem under resource constraints of the accumulative type only,
while the problem with renewable resource constraints is known to be NP-hard
already in very simple cases. Indeed, in the case of accumulative resource
constraints any quantity of resource allotted for unit time period $t$
($t = 1, 2, \ldots$) and not utilized in that period can be consumed at any
subsequent period $t' > t$, unlike renewable resources, where any part of the
resource not consumed at time period $t$ is simply wasted. To some extent,
money, expendables (with a considerably long shelf life), and other materials
may be considered as accumulative resources. The term “accumulative” means that
at any time period $t$ all resources allotted (and not consumed) for all
previous time periods are at our disposal: we consume resources accumulated
over the previous time periods.

* Supported by the Russian Foundation for Basic Research (project 02-01-01153),
INTAS (project 00-217), and “Russian Universities” (project UR.04.01.012).

It should be noted here that the terms cumulative and cumulatives resource
constraints, introduced in the OR literature quite recently (in 1993 [1] and
2001 [2] respectively), while sounding similar, have nothing to do with
accumulative resource constraints, and in fact are kinds of renewable
resources. (The cumulative resource is just a special case of the latter, when
there is only one renewable resource with availability constant in time, while
the cumulatives constraint extends the classical renewable resource constraints
to the case of negative resource requirements, which may be treated as
resource production, and to the case of resource constraints from below, when
we need to cover a given amount of work specified in time.) Here the term
“cumulative” relates to the total resource requirement summed up over
all jobs being processed in a given unit time period $t$, and one can easily
observe that there is nothing new in this type of constraint, for it is typical
for renewable resources. So, we find the notion “cumulative” redundant, while
the term “cumulatives” is just illiterate, because “cumulative” is not a noun.

It can also be seen that any non-renewable resource may be considered as 
a special case of an accumulative one, when the whole amount of resource is 
allotted in the first unit time period, while the resource allotted in subsequent 
periods is equal to zero. 

Known results. Let PSP$^a$ stand for the PSP problem with resource
constraints of only the accumulative type. In [7] some basic properties of
optimal schedules for the PSP$^a$ problem were established, which allowed the
authors to elaborate a pseudopolynomial algorithm.

In [5], for the case of real processing times, a polynomial time asymptotically
optimal algorithm was suggested with an absolute error tending to zero as the
number of jobs increases.

Finally, a fully polynomial time algorithm was presented in [6] for solving
the case of the PSP$^a$ problem in which both the intensities of allotment
(further referred to as availabilities) of accumulative resources and the
resource requirements are non-negative.

New results. We prove that PSP$^a$ is still polynomially solvable if negative
availabilities are allowed, while resource requirements must still remain
non-negative. On the other hand, it is shown that if negative resource
requirements are allowed, the problem becomes NP-hard in the strong sense.



2 Problem setting 

Let us first formulate a more general problem with resource constraints of 
two types: accumulative and renewable. 

We consider a project with a set of activities (jobs) $J = \{1, \ldots, n\}$.
Precedence constraints are given by a directed acyclic graph $G = (V, U)$ (a
so-called reduction graph of a partial order), where $V = \{v_1, \ldots, v_n\}$
is a set of vertices that correspond to jobs, $j \in J$, and $U$ is a set of
edges: an edge $(v_i, v_j)$ belongs to $U$ if and only if job $i$ is a direct
predecessor of job $j$.

There are resource constraints of two types: accumulative and renewable.
Corresponding sets of resources are denoted by $\mathcal{K}^a$ and
$\mathcal{K}^r$ respectively. For each constrained resource
$k \in \mathcal{K} = \mathcal{K}^a \cup \mathcal{K}^r$ its available quantity
$q_k(t)$ is allotted in the time interval $[t-1, t)$ ($q_k(t)$ may be of an
arbitrary sign). For each resource $k \in \mathcal{K}^a$ and each time interval
$[t-1, t)$ the whole amount of resource allotted (and not consumed) in the
interval $[0, t)$ may be consumed in the time interval $[t-1, t)$.

Processing time $p_j$ and resource requirements $r_{jk}(\tau)$ are given for
each activity $j \in J$, resource $k \in \mathcal{K}$, and time interval
$[\tau - 1, \tau)$ ($\tau = 1, \ldots, p_j$) calculated from the starting point
$s_j$ of the activity.

Deadlines $d_j \in \mathbb{Z}^+$ are specified for all $j \in J_d \subseteq J$.

The objective is to compute a schedule $S = \{s_j\}$ that meets all resource
and precedence constraints and deadlines and minimizes the makespan.

A formal setting of the PSP problem is presented below. 

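To minimize the makespan

$$C_{\max}(S) = \max_{j \in J} (s_j + p_j) \qquad (1)$$

under the following constraints on variables $\{s_j \mid j \in J\}$:

$$s_j + p_j \le d_j, \quad j \in J_d; \qquad (2)$$

$$s_i + p_i \le s_j, \quad i \in \mathrm{Pred}(j),\ j \in J; \qquad (3)$$

$$\sum_{j \in J(t)} r_{jk}(t - s_j) \le q_k(t), \quad
  k \in \mathcal{K}^r,\ t \in \mathbb{Z}^+ \setminus \{0\}; \qquad (4)$$

$$\sum_{t'=1}^{t} \sum_{j \in J(t')} r_{jk}(t' - s_j) \le \sum_{t'=1}^{t} q_k(t'), \quad
  k \in \mathcal{K}^a,\ t \in \mathbb{Z}^+ \setminus \{0\}; \qquad (5)$$

$$s_j \in \mathbb{Z}^+, \quad j \in J, \qquad (6)$$

where $J(t) = \{j \mid s_j < t \le s_j + p_j\}$ and $\mathrm{Pred}(j)$ is the set
of direct predecessors of job $j$.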

The problem (1)-(6) is known to be NP-hard if $\mathcal{K}^r \neq \emptyset$. We
will further concentrate on the case $\mathcal{K}^r = \emptyset$ (so, we have
$\mathcal{K} = \mathcal{K}^a$). The corresponding PSP problem will be denoted by
PSP$^a$.
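
To illustrate how the accumulative constraint (5) differs from a per-period
(renewable) limit, the following Python sketch checks a given schedule for a
single accumulative resource. It is a simplified illustration under stated
assumptions: only one resource is considered, deadlines are omitted, the
availability is passed as a finite list `avail` with `avail[t-1]` $= q(t)$, and
the constraint is checked only for the periods covered by that list. The
function name and argument layout are hypothetical.

```python
from itertools import accumulate


def accumulative_feasible(start, proc, req, avail, pred):
    """Check precedence constraints (3) and the accumulative constraint (5).

    start[j], proc[j] : start time s_j and processing time p_j of job j
    req[j][tau - 1]   : requirement of job j in its tau-th period
    avail[t - 1]      : amount q(t) allotted in period [t-1, t)
    pred[j]           : list of direct predecessors of job j
    """
    n = len(start)
    horizon = len(avail)
    if any(start[i] + proc[i] > start[j] for j in range(n) for i in pred[j]):
        return False
    if any(start[j] + proc[j] > horizon for j in range(n)):
        return False  # schedule must fit into the checked horizon
    used = [0] * horizon  # used[t - 1]: total consumption in period [t-1, t)
    for j in range(n):
        for tau in range(1, proc[j] + 1):
            used[start[j] + tau - 1] += req[j][tau - 1]
    # Cumulative consumption must never exceed cumulative allotment.
    return all(u <= a for u, a in zip(accumulate(used), accumulate(avail)))


if __name__ == "__main__":
    # Job 0 needs 3 units, but only 2 units are allotted per period.
    print(accumulative_feasible([1, 0], [1, 1], [[3], [1]], [2, 2, 2], [[], []]))
    print(accumulative_feasible([0, 1], [1, 1], [[3], [1]], [2, 2, 2], [[], []]))
```

In the first call job 0 runs in the second period and uses the unit left over
from the first period, which a renewable resource of capacity 2 would not
permit; in the second call it starts too early.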

Our positive results are formulated in the following section. 



3 Polynomial solvability of the PSP$^a$ problem under
resource availabilities of an arbitrary sign

We assume that intensities of resource requirements and resource availabilities
are given by sequences of pairs $\{(r^i_{jk}, p^i_{jk}),\ i = 1, \ldots, N_{jk}\}$
and $\{(q^i_k, a^i_k),\ i = 1, \ldots, N_k\}$, where $i$ is the index of the
$i$th interval of constancy of functions $r_{jk}(t)$ and $q_k(t)$ respectively,
and $p^i_{jk}$ and $a^i_k$ are the length and the starting point of the
corresponding intervals.

Let $T$ be a positive integer. A schedule $S^T = \{s^T_j \mid j \in J\}$ is
called a $T$-late schedule if all $s^T_j$ are the maximum possible under
constraints (2), (3), (6) and

$$s^T_j + p_j \le T, \quad j \in J.$$
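
A $T$-late schedule can be obtained by a backward pass over the precedence
graph. The Python sketch below is one straightforward way to do this and is not
taken from the paper: every job finishes as late as $T$, its own deadline and
the start times of its direct successors allow. The function name, the
dictionary `deadline` (jobs without a deadline are simply absent) and the
convention of returning `None` when some start time would be negative are
assumptions of the example.

```python
from functools import lru_cache


def t_late_schedule(T, proc, pred, deadline):
    """Latest start times under constraints (2), (3), (6) and s_j + p_j <= T."""
    n = len(proc)
    succ = [[] for _ in range(n)]
    for j in range(n):
        for i in pred[j]:
            succ[i].append(j)

    @lru_cache(maxsize=None)
    def latest_start(j):
        # A job must finish before T, before its deadline and before
        # any direct successor starts.
        finish = min([T, deadline.get(j, T)] + [latest_start(k) for k in succ[j]])
        return finish - proc[j]

    start = [latest_start(j) for j in range(n)]
    # Constraint (6) requires non-negative integer start times.
    return start if min(start, default=0) >= 0 else None


if __name__ == "__main__":
    proc = [2, 3, 1]
    pred = [[], [0], [0]]      # job 0 precedes jobs 1 and 2
    deadline = {2: 4}          # only job 2 has a deadline
    print(t_late_schedule(6, proc, pred, deadline))
```

The recursion relies on the precedence graph being acyclic, as assumed in the
problem setting; for very long chains an explicit reverse topological order
would avoid deep recursion.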

Theorem 1. Problem PSP$^a$ with nonnegative resource requirements admitting
resource availabilities of an arbitrary sign can be solved in polynomial
time

$$O((N^{-} + \log p_{\max} + \log n)(u + N \log f + N' \log K)),$$

where $n$ is the number of jobs,
$u$ is the number of edges of the reduction graph $G$,
$f$ is the independence number of graph $G$ (the width of the partial order
on the set of jobs),
$K$ is the number of constrained resources,
$N$ is the total (over all resources and all jobs) number of intervals of
constancy of functions $r_{jk}(t)$,
$N'$ is $N$ plus the total (over all resources $k \in \mathcal{K}$) number of
intervals of constancy of functions $q_k(t)$,
$N^{-}$ is the number of points $t$ in which the functions $q_k(t)$ change their
sign,
$p_{\max} = \max_{j \in J} p_j$.

Let us briefly outline the polynomial time algorithm for solving the PSP$^a$
problem under the assumption that resource requirements are nonnegative.

Algorithm A

Preliminary calculation:

Find the values of each function $Q_k(t) = \sum_{t'=1}^{t} q_k(t')$,
$k \in \mathcal{K}$, at nodes $t = a^i_k$, $i = 1, \ldots, N_k$.

Calculate the values $R_k = \sum_{j \in J} \sum_{i=1}^{N_{jk}} r^i_{jk} p^i_{jk}$
for each resource $k \in \mathcal{K}$.

Find the increasing order of (different) deadlines of jobs.

Compute the length $T_{cr}$ of the shortest schedule $S$, which is equal to the
length of the longest (critical) path in graph $G$.

Compute the minimal set of disjoint time intervals
$\mathcal{T} = \{[a_l, b_l] \mid l = 1, \ldots, L\}$ containing the set

$$\mathcal{T}' = \{t \in \mathbb{Z}^+ \mid t \ge T_{cr},\ R_k \le Q_k(t),\
  \forall k \in \mathcal{K}\}.$$

(The last interval $[a_L, b_L]$ may be infinite, i.e., $b_L = \infty$ is
possible. It is clear that $L$ is no greater than the total number $N^{-}$ of
points $t$ in which the functions $q_k(t)$ change their sign.)

For l = 1, ..., L do begin
    a := a_l; b := b_l;
    if b = ∞ then goto M;
    else if $S^b$ is feasible then goto M;
end (for);

output (“There are no feasible solutions”); STOP;

M: T := min{b, a + T_cr};
   if $S^T$ is infeasible then
       { output (“There are no feasible solutions”); STOP }
   else using dichotomy, find the minimal integer $t \in [a, T]$ for which the
       schedule $S^t$ is feasible, and output $S^t$ as the desired optimal
       schedule.
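
The dichotomy in the last step of Algorithm A is an ordinary binary search over
the integers in $[a, T]$. The sketch below shows only that step. It assumes a
caller-supplied predicate `feasible(t)` that tests the $t$-late schedule, that
`feasible(T)` is true, and that feasibility is monotone on the interval (once
true, it stays true for larger $t$); the function name and the predicate are
hypothetical.

```python
def minimal_feasible(a, T, feasible):
    """Smallest integer t in [a, T] with feasible(t) true.

    Assumes feasible(T) is true and feasibility is monotone on [a, T].
    """
    lo, hi = a, T
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid        # mid works, so the minimum is at most mid
        else:
            lo = mid + 1    # mid fails, so the minimum is larger
    return lo


if __name__ == "__main__":
    # Toy predicate: schedules of length 11 or more are feasible.
    print(minimal_feasible(3, 20, lambda t: t >= 11))
```

The predicate is evaluated a logarithmic number of times in the length of the
interval, which matches the logarithmic factors in the bound of Theorem 1 only
informally.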

4 NP-hardness of the PSP$^a$ problem under resource
requirements of an arbitrary sign

Theorem 2. Problem PSP$^a$ with resource requirements of an arbitrary sign
is NP-hard in the strong sense even in the special case when there are no
activities with deadlines; $p_j = 1$, $\forall j \in J$; graph $G$ represents a
collection of disjoint chains, and there is only one accumulative resource with
a constant availability.

To prove the above statement, given an instance $\{e_1, \ldots, e_{3k}\}$ of the
3-PARTITION problem ($\sum_i e_i = kE$; $\frac{1}{4}E < e_i < \frac{1}{2}E$,
$\forall i$), we define the following instance of the PSP$^a$ problem.

Graph $G$ consists of $(3k + 1)$ disjoint chains $C_1, \ldots, C_{3k+1}$. Chain
$C_i$ ($i = 1, \ldots, 3k$) contains two vertices: $v'_i \to v''_i$, the
resource requirements of jobs corresponding to vertices $v'_i$ and $v''_i$
being equal to $e_i$ and $-e_i$ respectively. Chain $C_{3k+1}$ consists of $k$
vertices; resource requirements of the corresponding jobs are equal to $E$. All
jobs have unit processing times. Resource availability is constant in time,
equal to $E$ in each unit time period.

It can be shown that the above instance of the PSP$^a$ problem has the
makespan $C_{\max}(S) = k + 1$ if and only if there exists a required
3-partition for the original 3-PARTITION problem.
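
One direction of this equivalence can be checked numerically. The Python sketch
below builds, from a given grouping of the numbers into $k$ triples, the
schedule in which the jobs $v'_i$ of the $t$-th triple run in period $t$, the
jobs $v''_i$ in period $t + 1$, and the long chain occupies periods
$2, \ldots, k + 1$, and then tests the accumulative constraint with availability
$E$ per period. The schedule layout and the function names are assumptions of
this illustration; the sketch does not prove the converse direction.

```python
from itertools import accumulate


def period_consumption(triples, E):
    """Net resource consumption in periods 1..k+1 of the constructed schedule."""
    k = len(triples)
    used = [0] * (k + 1)
    for t, triple in enumerate(triples):
        for e in triple:
            used[t] += e        # job v'_i consumes e_i in period t + 1
            used[t + 1] -= e    # job v''_i has requirement -e_i one period later
    for t in range(1, k + 1):
        used[t] += E            # long chain occupies periods 2..k+1
    return used


def within_budget(used, E):
    """Cumulative consumption never exceeds the cumulative allotment t * E."""
    return all(u <= E * t for t, u in enumerate(accumulate(used), start=1))


if __name__ == "__main__":
    E = 20
    print(within_budget(period_consumption([(6, 7, 7), (6, 6, 8)], E), E))
    print(within_budget(period_consumption([(7, 7, 8), (6, 6, 6)], E), E))
```

The first grouping is a valid 3-partition (both triples sum to $E = 20$) and
yields a feasible schedule of length $k + 1 = 3$; the second grouping
overshoots the allotment already in the first period.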

It can be easily seen that problem PSP$^a$ remains NP-hard in the strong sense
if graph $G$ consists of chains having length at most 2, while the resource
availability function is not constant.

5 Concluding remarks 

Concerning the positive result stated in Theorem 1, it should be noted that in
addition to its independent significance (as an important polynomially solvable
case of the PSP problem), it can also be utilised in another way: for computing
a nontrivial lower bound on the optimum of the PSP in the general case, since
accumulative resource constraints can be considered as a relaxation of
renewable resource constraints.







References 

1. Aggoun, A., Beldiceanu, N. (1993) Extending CHIP to solve Complex Scheduling
and Packing Problems. Mathl. Comput. Modelling 17 (7), 57-73.

2. Beldiceanu, N., Carlsson, M. (2001) A New Multi-Resource cumulatives
Constraint with Negative Heights. SICS Technical Report T2001:11, Uppsala,
Sweden.

3. Böttcher, J., Drexl, A., Salewski, F. (1999) Project scheduling under partially
renewable resource constraints. Management Science 45, 543-559.

4. Brucker, P., Drexl, A., Möhring, R., Neumann, K., Pesch, E. (1999)
Resource-constrained project scheduling: Notation, classification, models, and
methods. European J. Oper. Res. 112, 3-41.

5. Gimadi, E.Kh., Puzynina, N.M., Sevastianov, S.V. (1979) On some optimisation
problems arising while realization of large-scale projects (such as BAM) (in
Russian). Ekonomika i mat. metody 5, 1017-1020.

6. Gimadi, E.Kh., Zaljubovsky, V.V., Sevastianov, S.V. (2000) Polynomial
solvability of the project scheduling problem with accumulative resources and
deadlines (in Russian). Diskret. Analiz i Issled. Oper. 7 (1), Ser. 2, 9-34.

7. Kozlov, M.K., Shafranskii, V.V. (1977) Project scheduling under given
availabilities of accumulative resources (in Russian). Izvestija AN SSSR.
Technicheskaja kibernetika, no. 4, 75-81.

8. Zukhovitskii, S.I., Radchik, I.A. (1965) Mathematical methods in project
scheduling (in Russian). “Nauka”, Moscow.




Cost Optimized Layout of Fibre Optic 
Networks in the Access Net Domain 



Peter Bachhiesl, Gernot Paulus, Markus Prossegger, Joachim Werner,
and Herbert Stögner

Carinthian Tech Institute, Department of Telematics and Network Engineering,
Primoschgasse 8, A-9020 Klagenfurt, email: p.bachhiesl@cti.ac.at
Department of Geoinformatics, Europastraße 4, A-9584 Villach



Abstract. During the last two years European network carriers have invested 7.5
billion euros in the expansion of the core and the distribution net domain
(backbones, city backbones and metropolitan area networks). However,
investigations have shown that about 95% of the total costs for the
implementation may be expected for the area-wide realization of the last mile
(access networks). In order to achieve a return on investment, carriers will be
forced to link access net nodes, like corporate clients, private customers or
communication nodes for modern mobile services (e.g. UMTS), to their city
backbones. We present the methodology behind the planning tool NETQUEST-OPT for
the computation of cost optimized, real-world layouts of fiber optic access
networks. Based on detailed geoinformation data, cluster strategies, exact and
approximation algorithms of graph theory, combinatorial optimization and ring
closure heuristics form the optimization kernel. Besides the methodology, we
present results of a benchmark project which was processed with one of our
industrial partners.



1 Introduction 

During the last decade the demand for broadband capacity has entailed
an increasing changeover from conventional transmission techniques to
fiber optic technology [6]. In order to achieve a return on investment all major
network carriers will be forced to push the implementation of the non-existing
and cost intensive last mile [9].

A variety of strategic planning tools focus on problems of laying optimization,
discussion of reliability and traffic optimization in the core net domain
[1]. Nowadays carriers hardly use any computational planning tools for cost
estimation and optimization in the access net domain. The planning process is
performed manually based on expert knowledge and therefore often yields
suboptimal decisions. However, since the telecommunication market is running
through a period of consolidation, neglecting potential savings has to be
regarded as a critical competitive disadvantage.

In several discussions with our industrial partners it has turned out that
the following problem is of substantial interest: minimize the implementation
costs for a fiber optic network in the access net domain under consideration
of the following constraints: (Problem Dimension) Up to 2000 connection
objects (access net nodes) in an area of up to 4 square kilometers are assumed
as an appropriate access net dimension. (Real World Condition) Computations
shall be done with respect to real world conditions including the local spatial
topology. The implementation costs are mainly determined by underground
work; material costs, e.g. cable costs or pipe installation costs, are taken into
account by a fixed penalty per meter. (Network Topologies and Reliability)
Umbrella networks and, in view of quality standards (e.g. those required by
COLT Telecom), also meshed networks shall be supported such that all access
net nodes are supplied in a reliable way, accepting branch lines with a
predefined maximum length of $K_{\max}$ meters.

2 Umbrella Networks and Meshed Networks in the 
Access Net Domain 

In order to consider real world conditions, detailed geoinformation data,
usually supplied by the network carriers, are essential. According to the
methods in [8] the following information is extracted from the GIS databases:
(Penalty Grid) A spatially balanced score card combines all relevant land
classes with typical implementation costs. The access net domain is regarded
as a regular grid $[x_i, x_{i+1}] \times [y_j, y_{j+1}]$, where $i$ and $j$ are
indices of finite index sets. Each entry $P_{ij}$ of a penalty matrix $P$
describes the specific implementation costs for the grid pixel
$[x_i, x_{i+1}] \times [y_j, y_{j+1}]$. (Auxiliary geometries) Plane
geometries of all relevant land classes are exported. The shapes of these
geometries are determined by a set $S$ of auxiliary nodes. (Access net nodes)
Finally, the set $R$ of access net nodes is specified by the according
coordinate vectors.

Conceptually, we can formulate the route planning problem on an undirected
graph G = (V, E, W). V is the disjoint union of R and S, E is the
set of possible edges [i, j] for i, j ∈ V, and W denotes a symmetric cost matrix
that weights the edges of E with respect to the penalty grid P. Let s_k be an
element of the finite set of intersections of the edge [i, j] with the grid
lines of P, where s_0 = i, s_K = j and COORD(s_k) are the corresponding
coordinates. Furthermore, we denote by P_k the entry of the penalty matrix
P corresponding to the pixel preceding s_k. The entries of W are thus
defined as:



W_{ij} = \sum_{k=0}^{K-1} \| COORD(s_{k+1}) - COORD(s_k) \|_2 \, (P_{k+1} + F)   for i \neq j,
W_{ij} = \infty   otherwise.   (1)

F is a fixed penalty including material costs per unit distance and ||.||_2
denotes the Euclidean norm.
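
As a rough illustration of how a single entry of W could be evaluated, the
following Python sketch splits a straight edge at the grid lines and charges
each piece with the penalty of the pixel it crosses. The function name
edge_cost, the layout penalty[j][i], the uniform cell size and the assumption
that the pixel penalty and the fixed penalty add up per unit length are
illustrative choices, not taken from the paper.

```python
import math

def edge_cost(p, q, penalty, fixed_penalty, cell=1.0):
    """Cost of the straight edge p -> q over a regular penalty grid.

    penalty[j][i] is the cost per unit length of the pixel
    [i*cell, (i+1)*cell] x [j*cell, (j+1)*cell] (assumed layout);
    fixed_penalty plays the role of the fixed penalty per unit distance.
    """
    (x0, y0), (x1, y1) = p, q
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0.0:
        return math.inf  # the cost matrix is infinite for i = j
    # Parameters t in [0, 1] at which the segment crosses a grid line.
    cuts = {0.0, 1.0}
    for a, b in ((x0, x1), (y0, y1)):
        if a != b:
            lo, hi = sorted((a, b))
            g = math.ceil(lo / cell)
            while g * cell <= hi:
                cuts.add((g * cell - a) / (b - a))
                g += 1
    ts = sorted(t for t in cuts if 0.0 <= t <= 1.0)
    cost = 0.0
    for t0, t1 in zip(ts, ts[1:]):
        tm = 0.5 * (t0 + t1)  # the midpoint identifies the pixel crossed
        i = int((x0 + tm * (x1 - x0)) // cell)
        j = int((y0 + tm * (y1 - y0)) // cell)
        cost += (t1 - t0) * length * (penalty[j][i] + fixed_penalty)
    return cost
```

For an edge of length 1 that spends half its length in a pixel with penalty 1
and half in a pixel with penalty 3, with a fixed penalty of 0.5, the sketch
returns 0.5 * 1.5 + 0.5 * 3.5 = 2.5.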

Formally, an umbrella network is a tree T which spans all access
net nodes R, possibly using auxiliary nodes in S. If E_T ⊆ E
is the set of edges of T, we consider the optimization problem:

T^* = \arg\min_T \{ \sum_{[i,j] \in E_T} W_{ij} \;|\; T spans R \cup S', \; S' \in p(S) \}   (2)



p(S) denotes the power set of S. Problem (2) is known in the literature as
the Steiner network problem in graphs [3]. Upper bounds on the optimal
solution of (2) are computed by local preprocessing techniques and
polynomial-time algorithms for problem dimensions of up to |V| = 2000,
|R| = |V|/2, assuming sparse graphs [7]. Nevertheless, due to the complexity
of real world geometries, the route planning problem leads to dimensions of
up to |V| = 20000 nodes. Such problem dimensions prompted us to segment (2)
into three hierarchical steps: determination of local clusters, local
solution of (2) in each cluster, and computation of a shortest spanning tree
over the clusters (cluster SST problem).

Standard procedures based on linkage information and cophenetic correlation
are used to group the access net nodes into local clusters R^k
with respect to W [2]. Problem (2) is thus reduced to the subgraph
G^k on the vertex set R^k ∪ S^k for the k-th cluster, which is preprocessed
according to [4] and solved by the algorithm proposed by Hakimi and Lawler
[3]. Since the algorithm is exponential in the number of auxiliary nodes,
we restrict the search for spanning trees to the vertex sets R^k ∪ S',
S' ∈ p(S^k), with |S'| < p_relax. To estimate the quality of the solutions
obtained, 100 test problems from the OR instance sets B, C, D and E of the
SteinLib library [7] were computed. They yield the performance ratio estimate
1 ≤ A(I)/OPT(I) ≤ 1.2 for all instances I, where A(I) and OPT(I)
denote the approximate and the optimal solution for the instance I,
respectively.

The segmentation into independent clusters allows efficient distributed
processing, which is realized by a proprietary development based on the
PARMAT toolbox. T^k denotes the optimized tree in the k-th cluster.

In order to obtain a global umbrella network, we consider each cluster
as a node of a superior complete graph H. The cost of edge [i, j] is given by
the cheapest path between the trees T^i and T^j, taking all auxiliary nodes
in S into account. Finally, we solve a shortest spanning tree problem
on H to obtain the links between the local cluster trees. The union of these
cluster-interconnecting links and the local cluster trees yields the
cost-optimized umbrella network T.
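
The last step, a shortest spanning tree on the cluster graph H, can be
sketched with Prim's algorithm. The sketch below assumes that the cheapest
connection costs between every pair of local cluster trees have already been
computed and are given as a symmetric matrix; the function name and the
matrix representation are illustrative assumptions, and the paper does not
state which spanning tree algorithm was actually used.

```python
def cluster_spanning_tree(cost):
    """Prim's algorithm on a complete graph given as a symmetric matrix.

    cost[a][b] stands for the cheapest connection between the local
    trees of clusters a and b (assumed to be precomputed).
    Returns the list of chosen inter-cluster links (a, b).
    """
    n = len(cost)
    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known link into the growing tree
    parent = [None] * n
    best[0] = 0.0               # start from cluster 0
    links = []
    for _ in range(n):
        # Pick the cluster outside the tree that is cheapest to attach.
        a = min((v for v in range(n) if not in_tree[v]), key=best.__getitem__)
        in_tree[a] = True
        if parent[a] is not None:
            links.append((parent[a], a))
        # Update the attachment costs of the remaining clusters.
        for b in range(n):
            if not in_tree[b] and cost[a][b] < best[b]:
                best[b], parent[b] = cost[a][b], a
    return links
```

For three clusters with connection costs 4 (0-1), 1 (0-2) and 2 (1-2) the
sketch selects the links (0, 2) and (2, 1) with a total cost of 3.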

For reliable network structures we have to find a cost-optimized redundant
network on G, accepting branch lines with a maximum length
of K_max meters. For this case we propose an augmentation heuristic: if T
is a subgraph of G, we search for a cost-optimized edge-disjoint subgraph
G* ⊂ G such that T ∪ G* is 2-connected except for branch lines with a
maximum length of K_max meters. The computation of branch lines and their
lengths for a given graph is based on depth-first search algorithms for the
determination of blocks and cut vertices. Furthermore, we define
G_r := T ∪ G*, G_r = (V_r, E_r), as the meshed part and G_nr := G \ G_r as
the non-meshed part of the network under consideration. If B is the set of
leaves in T, we consider P(u, v) for u, v ∈ B as the cost-optimal path from
u to v in G under the constraint that P(u, v) ∩ T = ∅. P(u, v) is called a
feasible augmentation path with costs w[P(u, v)]. The united graph
T ∪ P(u, v) contains a cycle C(u, v) depending on P(u, v). The quantity
l[C(u, v)] is the sum of all edge lengths l(e) in the cycle C(u, v), where
l(e) := 0 for e ∈ E_r or e ∈ P(u, v) and l(e) := ||i - j||_2 otherwise.
l[C(u, v)] measures the length of that part of T which is closed into a ring
by including the feasible augmentation path P(u, v). Thus, we define

\kappa(u, v) := w[P(u, v)] / l[C(u, v)]   (3)

as the ring closure relation for the leaves u and v. κ(u, v) estimates the
quality of a ring closure between the leaves u and v. An iterative procedure
inserts leaf-connecting paths into G_r with respect to (3) until the maximum
branch line length is lower than K_max. This greedy procedure yields
a ring-closed network as an upper bound for the optimal one sought.
To improve it, we propose an edge-deleting stingy postprocessing as
described in [5].



3 An Industrial Benchmark Project 



We consider an access net domain provided by NetCologne, one of the major
private city net carriers in Germany. We were able to benchmark the methods
described above with respect to two aspects: the validity
of the underlying spatially balanced score card and the demonstration of
savings compared with the traditional process of manual planning.

Figure 1 shows the penalty grid resulting from the spatially balanced
score card, and Figure 2 depicts the corresponding auxiliary geometries, the
clustering and all auxiliary points in the first cluster. To check the
validity of our computational approach, we recalculated the NetCologne cable
layout of an already existing umbrella network; the result was confirmed by
experienced network planners within an accuracy of ±5%. We solved
the local problems (2) and the cluster spanning tree problem for the
umbrella network in distributed mode on three workstations using 223 minutes
of CPU time. The result is depicted in Figure 3 and saves 22% distance and
20% costs compared with the NetCologne layout. We also computed a
meshed network with a maximum allowed branch length of K_max = 200 meters,
as shown in Figure 4. Even this meshed network yields a distance and cost
reduction of 20% and 11%, respectively.





[Figure 1: penalty grid; horizontal axis: easting [m]]

Fig. 1. Penalty grid for the NetCologne project. Gridding is done with a
resolution of one meter by one meter, colored from blue to red with
increasing average implementation costs [Euro/meter]: sidewalk = 87,
street = 110, unsurfaced area = 54 and crossing = 123. Branch lines for
connection objects inside parcels are weighted with 52 Euro/meter.

[Figure 2: access net nodes, auxiliary nodes and clustering; horizontal
axis: easting [m]]

Fig. 2. The 37 access net nodes are marked by red triangles. The black
circles depict the auxiliary nodes of the first cluster based on the
underlying auxiliary geometries. The problem dimensions are
|V^1| = 1260, |R^1| = 5 for the first cluster, |V^2| = 1935, |R^2| = 19
for the second and |V^3| = 2150, |R^3| = 13 for the third cluster,
in total |V| = 5345.



4 Concluding Remarks 

The accuracy of the penalty matrix as an output of the spatially balanced
score card has to be seen as a crucial factor for the validity of the
computational results. In general, the specific implementation costs are
influenced by a great variety of parameters. We therefore emphasize
that the above methodology is designed for a rough planning process of
network layouts. The details and their realization on site
remain the task of experienced construction engineers, but may be carried
out in a structured way based on a cost-optimized approach.

The consideration of usable infrastructures (subways, pipelines, partial
renting of existing networks of competitors and others) as well as
investigations of improved optimization routines that allow larger cluster
dimensions to be treated are topics of current work. It should be noted
that improvements of the optimization performance have to be weighed
against the rough nature of the planning process and the inaccuracy of
the underlying spatially balanced score card.

[Figure 3: computed umbrella network; horizontal axis: easting [m]]

Fig. 3. Computed umbrella network for 37 access net nodes.

[Figure 4: computed meshed network; horizontal axis: easting [m]]

Fig. 4. Meshed network. Reliability was exclusively required for the upper
part from y-coordinates 877 up to 1677.

References 

1. Bertsekas, D. (1998) Network Optimization: Continuous and Discrete Models.
Athena Scientific, Belmont

2. Blashfield, R. (1984) Cluster Analysis. Sage Publications, Santa Barbara

3. Cieslik, D. (1998) Steiner Minimal Trees. Kluwer Academic Publishers,
Dordrecht

4. Duin, C. (2000) Preprocessing the Steiner Problem in Graphs. In: Du, D.Z.,
et al. (eds.), Advances in Steiner Trees. Kluwer Academic Publishers, Dordrecht

5. Fortz, B. (2000) Design of Survivable Networks with Bounded Rings. Kluwer
Academic Publishers, Dordrecht

6. Gilder, G. (2002) Telecosm: The World after Bandwidth Abundance. Touchstone
Books, Carmichael

7. Koch, T., et al. (2001) SteinLib: An Updated Library on Steiner Tree
Problems in Graphs. In: Cheng, X., et al. (eds.), Steiner Trees in Industry.
Kluwer Academic Publishers, Dordrecht

8. Paulus, G., et al. (2002) Planning the Next Generation Network Infrastructure
in Urban Areas. Proc URISA Ann Conf, in print

9. Stogner, H., et al. (2001) Ein Top-Down Approach für die Planung von
photonischen Netzen. Telematik 4, 4-6







An Algorithm for Secure Electronic Voting
via the Internet



Alexander Prosser¹

Abteilung Produktionsmanagement
Wirtschaftsuniversität Wien, Augasse 2-6, A-1090 Wien
E-mail: alexander.prosser@wu-wien.ac.at



Robert Müller-Török²

Beratungsgesellschaft für Beteiligungsverwaltung Leipzig mbH
Ferdinand-Rhode-Straße 16, D-04107 Leipzig
E-mail: mueller-toeroek@BBVL.de



1 Introduction



1.1 Requirements

In a seminal contribution, Nurmi, Salomaa and Santean [1] identify two
services within an e-voting system: the registrar, from which an electronic
authorization to vote is obtained, and the polling station (electronic
ballot box). In the following, the suitability of methods known from the
literature is discussed and a proposal of our own is presented.³



¹ The research of this author was supported by the Jubiläumsfonds der
Stadt Wien.

² The article reflects the personal opinion of the author.

³ Non-cryptographic methods, such as issuing TANs for voting "transactions",
are not considered here because they cannot guarantee anonymity. If, for
instance, the TAN-issuing office and the electronic ballot box collude, the
votes can be attributed to the individual voters via the TAN. Compliance
with the electoral principles (here: anonymity) therefore depends on the
integrity of the election organizers; it cannot be guaranteed to the voter.



1.2 ANDOS-based methods

A corresponding protocol was proposed in [2]. It assumes a seller who has
n secret bit patterns to offer, which are acquired anonymously by a number
of buyers (for the application to elections see [3], [4]). The protocol is
considered secure but poorly scalable. According to the protocol, a randomly
assembled group of voters has to generate as many random numbers as there
are participants in the group and then run the protocol; in addition, a
group of "buyers" has to run through the protocol simultaneously, which
further limits its practicability.



1.3 Zero-knowledge proof (ZKP) protocols

These methods were developed for the dialogue between a chip card and a
card reader, in which the card convinces the reader that it possesses the
secret ID belonging to the chip card without disclosing that ID (for
concrete ZKP protocols see [5], based on RSA, and [6], based on discrete
logarithms). It has been proposed (see [7] and the literature cited there)
to use ZKP protocols for anonymous voting, with a token t that cannot be
traced back being derived from the voter ID (with the Fiat-Shamir protocol
this would be t = ID^2 mod(n), where n is the product of two "sufficiently
large" primes).

This token is requested by the voter from a registrar before the election
and stored on a chip card. When casting the vote, the voter identifies
himself with the token, proving his entitlement to use it by knowledge (but
not disclosure) of the ID. As the analysis of the use of ZKP protocols for
e-voting in [8] shows, anonymity cannot be guaranteed to the voter here
either: at least during registration, the link between the voter's ID and
the token is known. If registrar and ballot box collude, anonymity can
therefore be broken.
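
To make the token idea concrete, the following Python sketch plays through a
single round of a Fiat-Shamir style identification for the token
t = ID^2 mod(n). It is a toy illustration only: the function name, the
single-bit challenge and the tiny modulus used in the example are assumptions
made here, and a real protocol would repeat many rounds with a large modulus.

```python
import secrets

def fiat_shamir_round(secret_id, n, rng=secrets.SystemRandom()):
    """One identification round for the public token t = ID^2 mod n."""
    t = pow(secret_id, 2, n)                # public token derived from the ID
    r = rng.randrange(1, n)                 # prover: random commitment value
    x = pow(r, 2, n)                        # prover -> verifier: commitment
    c = rng.randrange(2)                    # verifier -> prover: challenge bit
    y = (r * pow(secret_id, c, n)) % n      # prover -> verifier: response
    # Verifier: y^2 must equal x * t^c mod n; the ID itself is never sent.
    return pow(y, 2, n) == (x * pow(t, c, n)) % n
```

With the toy modulus n = 61 * 53 = 3233, an honest prover who knows the ID
passes every round.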

1.4 The FOO protocol

The scheme of Fujioka, Okamoto and Ohta [9] is based on the blind signature
as proposed by David Chaum [10]. It is the basis of numerous e-voting
implementations, for example the Sensus system by Cranor and Cytron [11]⁴.
In the FOO protocol, registration and voting take place in one step:

1. The voter generates an asymmetric key pair (k, k') using an arbitrary
   scheme.

2. The registrar generates an RSA key pair (e, d) mod(n) for all voters
   (of a given group).

3. The voter W requests a ballot SZ from the registrar, fills it in and
   returns x = r^e k(SZ) mod(n) to the registrar, where r is a random
   number.

4. The registrar checks whether the voter is eligible and whether he has
   already cast a vote; if everything is in order, it returns x^d, which
   according to the blind signature scheme resolves to
   x^d / r = k(SZ)^d mod(n).

5. The voter sends his vote thus authorized, [k(SZ), k(SZ)^d], to the
   ballot box, which verifies the authorization and stores the vote (still
   encoded with k).

6. These encoded ballots are published, and finally the voter sends k' to
   the ballot box, which uses it to open k(SZ).

⁴ The German product i-vote (http://www.e-vote.de, formerly
http://www.internetwahlen.de) is declared to be based on the blind
signature; a definitive statement about the protocol used is not possible,
however, since to our knowledge the algorithm has not been published.



In the original protocol the last step is carried out at a separate point
in time; the implementations known to the authors, however, let this step
follow step 5 immediately. The description concentrates on the actual
voting protocol; naturally, all communication is cryptographically secured
and authenticated by digital signature (the latter of course does not apply
to the submission of the ballot and of k' by the voter in steps 5 and 6).
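
The blind signature in steps 3 to 5 can be sketched in a few lines of Python.
The sketch uses a toy RSA key (61 * 53) that is far too small for real use;
the function names and the representation of the encoded ballot k(SZ) as an
integer smaller than n are assumptions made for illustration only.

```python
import math
import secrets

# Toy RSA key of the registrar (illustrative size only).
P, Q = 61, 53
N = P * Q                            # modulus n
E = 17                               # public exponent e
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent d

def blind(message, n=N, e=E):
    """Voter: hide the message with a random factor r, x = r^e * m mod n."""
    while True:
        r = secrets.randbelow(n - 2) + 2
        if math.gcd(r, n) == 1:      # r must be invertible mod n
            return (pow(r, e, n) * message) % n, r

def sign(x, n=N, d=D):
    """Registrar: sign the blinded value without seeing the message."""
    return pow(x, d, n)

def unblind(signed_x, r, n=N):
    """Voter: divide by r to obtain m^d mod n."""
    return (signed_x * pow(r, -1, n)) % n

def verify(message, signature, n=N, e=E):
    """Ballot box: check the registrar's signature on the message."""
    return pow(signature, e, n) == message % n
```

A usage example: with m = 1234 standing for k(SZ), the call
`x, r = blind(m)` followed by `s = unblind(sign(x), r)` gives a signature s
for which `verify(m, s)` holds, although the registrar only ever saw x.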



1.5 Critical assessment of the FOO protocol

While at the application level the anonymity of the voter is ensured by the
blind signature, provided sufficiently large keys are used, at the system
level anonymity cannot be ensured by this scheme, because (identified)
registration and (anonymous) voting take place in one step. The voter could
be traced via the IP address or via temporary files (e.g. cookies). Every
e-voting protocol that runs in a single step has this problem, irrespective
of the algorithm used.

Furthermore, the administrators of the system are able to insert votes on
behalf of non-voters. To do so, they generate key pairs (k, k') for which
corresponding ballots are fed into the system; finally the "voters" submit
their k'. This can be done for non-voters at the end of the election; if
the corresponding log file entries are manipulated as well, a seemingly
correct history of the voting process can be constructed.



2 A Proposed E-Voting Protocol



2.1 The protocol



The protocol proposed here is a further development of the algorithm in
[12]. As in the FOO protocol, there is a registrar and an electronic ballot
box. Here, however, the protocol runs in two phases that are strictly
separated in time: registration, in which a blindly signed voting token is
issued, and voting. As in the description of the FOO protocol, the general
encryption and digital signing of the exchanged messages are omitted, since
they are standard functionality.

Registration:

1. The registrar holds an RSA key pair (e, d) for each electoral district
   w; each trust center that can be used for the election holds a key pair
   (ε, δ).

2. The eligible voter sends his trust center certificate to the registrar,
   which resolves the certificate against the trust center, checks the
   voter's eligibility and returns the identifier of the district w and
   the key e belonging to w.

3. The eligible voter generates two asymmetric key pairs (u, t) and
   (ω, τ); this may, but need not, be done with RSA. He computes
   b = r^e t (mod n) and β = ρ^ε τ (mod n) with random numbers r, ρ. He
   then signs a package consisting of b, w and a standardized application
   text for the issue of a voting token and sends it to the registrar.
   After checking eligibility, the registrar signs b, giving b^d, and
   returns it to the voter, who divides by r and obtains t^d.

4. The analogous process is carried out with the trust center, giving the
   voter the second part of his voting token, τ^δ.

Interim storage

1. After the end of registration, the voter stores his two token parts and
   the secret keys (u, ω) on a secure medium (for the implementation see
   Section 2.3).

Voting

In the following communication steps the voter uses neither his digital
signature nor the public key cryptography held at the trust center, since
these steps are meant to be strictly anonymous.

1. The voter generates a pair of session keys (k, k') and

2. sends the two token parts, w, the identifier of his trust center
   and k' to the ballot box.

3. The ballot box knows all relevant e and ε and checks whether
   (t^d)^e = t and (τ^δ)^ε = τ, whether the token has already been used
   and whether w and e match. If so, a consecutively numbered ballot SZ_i
   is sent to the voter encrypted as k'(SZ_i).

4. The voter decodes it with k, fills in SZ_i and sends the two token
   parts, k', the trust center identifier and SZ_i, u(SZ_i), ω(SZ_i) to
   the ballot box.

5. The ballot box checks again that the token is authorized and has not
   yet been used, and if so checks the authenticity of the token as in
   step 3.

6. In addition, the ballot box checks whether t(u(SZ_i)) = SZ_i and
   τ(ω(SZ_i)) = SZ_i. If so, the ballot box stores the parameters received
   and passes SZ_i on to the vote count.
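
The core of the token check in step 3 can be sketched as follows. The sketch
assumes toy RSA keys for both the registrar and the trust center, treats the
public keys t and τ of the voter as plain integers, and marks a token as used
as soon as it is accepted; the names make_key, blind_sign and BallotBox are
hypothetical, and the district check and the ballot signatures of step 6 are
left out.

```python
def make_key(p, q, e):
    """Toy RSA key from two small primes: returns (public, private)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (n, e), (n, d)

def blind_sign(value, public, private, r):
    """Blind, sign and unblind in one place (voter and authority combined)."""
    n, e = public
    _, d = private
    blinded = (pow(r, e, n) * value) % n     # voter: b = r^e * t mod n
    signed = pow(blinded, d, n)              # authority: b^d mod n
    return (signed * pow(r, -1, n)) % n      # voter: divide by r -> t^d

class BallotBox:
    """Accepts a token only if both signatures verify and it is unused."""

    def __init__(self, registrar_public, trust_center_public):
        self.reg = registrar_public
        self.tc = trust_center_public
        self.used = set()

    def accept(self, t, sig_t, tau, sig_tau):
        n1, e = self.reg
        n2, eps = self.tc
        valid = (pow(sig_t, e, n1) == t % n1
                 and pow(sig_tau, eps, n2) == tau % n2)
        if not valid or (t, tau) in self.used:
            return False
        self.used.add((t, tau))              # prevent a second use
        return True
```

Because the ballot box needs both signatures, a token forged with the
registrar's key alone fails the second comparison, which is the property the
protocol relies on.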



2.2 Critical assessment

The anonymity of the voter is ensured at the application level by the blind
signature. Because (identified) registration and (anonymous) voting take
place in separate steps, tracing via the IP address can be technically ruled
out; tracing is technically feasible at all only if registration and voting
are performed from the same terminal with static IP address assignment.

Unlike in the algorithm of Fujioka et al., the administrator cannot insert
ballots for electronic voters who abstain from voting, because not only the
token signed by the registrar but also that of a certified trust center is
required. If the administrator inserts ballots nevertheless, this is
detectable; the ballots can be removed afterwards and the official result
corrected accordingly. Nor can the voter claim afterwards to have cast a
different vote, since he has bound himself through the key pairs (u, t) and
(ω, τ), with t and τ being authenticated by the registrar.



2.3 Implementation

Interim storage of the voting token requires a secure medium tied to the
person. Chip cards or signature cards are well suited for this. The voter
is identified (ID) either via the globally unique chip card ID, which
presupposes, however, that the election organizer issued the card itself
and recorded the link between person and card at issue (as is the case, for
example, with the student ID card of the Wirtschaftsuniversität Wien), or
via a unique personal identification number stored in the certificate, for
example the personal number of the Central Register of Residents
(Zentrales Melderegister) in Austria ("Bürgerkarte").

The signature cards offered in Austria, however, keep the certificate in a
freely readable area, so that the card is suitable for anonymous voting
only after an appropriate redesign; the certificate could, for instance, be
PIN-protected.



References

[1] H. Nurmi, A. Salomaa, L. Santean. Secret ballot elections in computer
networks. Computers and Security 36 (10):553-560, 1991.

[2] G. Brassard, C. Crépeau, J.-M. Robert. All-or-Nothing Disclosure of
Secrets. Proceedings of CRYPTO 86, Springer-Verlag, 234-238, 1987.

[3] A. Salomaa. Verifying and recasting secret ballots in computer networks.
New Results and New Trends in Computer Science. Springer-Verlag, Berlin,
283-289, 1991.

[4] A. Salomaa, L. Santean. Secret Selling of Secrets with Several Buyers.
EATCS Bulletin 42:178-186, 1990.

[5] A. Fiat, A. Shamir. How to prove yourself: Practical Solutions to
identification and signature problems. Advances in Cryptology - Crypto 86.
Springer-Verlag, Berlin, 1987.

[6] C.P. Schnorr. Efficient identification and signatures for smart cards.
Journal of Cryptology (4):161-174, 1991.

[7] H.C.A. van Tilborg. Fundamentals of Cryptology. Kluwer Academic
Publishers, Boston, 316ff, 2000.

[8] A. Prosser, R. Müller-Török. E-Voting - eine neue Qualität im
demokratischen Entscheidungsprozeß; accepted by Wirtschaftsinformatik.

[9] A. Fujioka, T. Okamoto, K. Ohta. A Practical Secret Voting Scheme for
Large Scale Elections. Advances in Cryptology - AUSCRYPT '92.
Springer-Verlag, Berlin, 244-251, 1993.

[10] D. Chaum. Blind Signatures for Untraceable Payments. In: D. Chaum,
R.L. Rivest, A.T. Sherman (eds.), Advances in Cryptology, Proceedings of
Crypto 82, 199-203, 1982.

[11] L.F. Cranor, R.K. Cytron. Sensus: A Security-Conscious Electronic
Polling System for the Internet. Proceedings of the Hawaii International
Conference on System Sciences (HICSS-97). Hawaii, 1997; downloaded from
http://lorrie.cranor.org/pubs/hicss/hicss.html (4.2.2001).

[12] A. Prosser, R. Müller-Török. Electronic Voting via the Internet.
International Conference on Enterprise Information Systems, ICEIS. Setúbal,
1061-1066, 2001.





Accelerated MILP-Strategies for the Optimal 
Operation Planning of Energy Supply System 



Peter Hackländer and Johannes F. Verstege

University of Wuppertal, Institute for Power System Engineering, Fuhlrottstr. 10, 
42097 Wuppertal, Germany, E-Mail: hacklaender@uni-wuppertal.de 



Abstract. This paper presents a MILP-based optimization algorithm for the
optimal operation planning of energy supply systems. The main tasks of
optimal operation planning are discussed, and the Priority-Based Dynamic
Search Strategy (PBDSS) for the solution of the optimization problem is
developed, taking into account different acceleration strategies. Exemplary
results comparing the standard branch-and-bound algorithm with the new
search strategy demonstrate a significant improvement of the optimization
process.



1 Introduction 

Since the European directives for the internal markets in electricity and gas
were passed to establish the requirements for competition, the opening of
the European energy markets has been pushed forward by national laws and
agreements. Although new opportunities arise for energy supply systems in
liberalized markets, the operators of the systems are forced to produce
and distribute energy more efficiently in order to increase their
competitiveness. One substantial way to reach this aim is optimal operation
planning, which enables the economical operation of the energy supply system
considering the following tasks [2]:

• Optimal commitment of plants and contracts: The main problem of the 
planning process is to determine the optimal commitment of all plants 
and contracts (i.e. to minimize the operational costs) with respect to 
all technical, contractual and further restrictions (better known as unit 
commitment and economic dispatch). 

• Observation and evaluation of energy markets: Apart from traditional
operation planning, the operator now has to observe and evaluate
the short-term electricity and gas markets in order to reduce the
operational costs. Not only the economic benefit has to be taken into
account; the technical feasibility must also be guaranteed at all times.

• Management of storages: The economical operation of energy supply
systems requires energy generation (or purchase) and consumption to be as
simultaneous as possible. Unfortunately, this criterion often cannot be
fulfilled because of the varying demand. One way to achieve better
utilization is to use energy storages. Whereas electric power can only be
stored at high cost and only to a small degree, there are various
possibilities to store other forms of energy (particularly district heating
and natural gas) at lower cost and more efficiently [2].

As shown above, optimal operation planning consists of different operation
tasks which have to be considered simultaneously within one closed algorithm
in order to capture the manifold interlinkings and to achieve the greatest
synergy. This requirement leads to a complex optimization problem which can
only be solved with computer-aided methods of Operations Research. As shown
in many practical applications, Mixed-Integer Linear Programming (MILP)
is a useful method for optimizing complex power systems because it satisfies
every restriction introduced exactly (even time-coupling ones), it is able
to find the global solution and to calculate the gap between the optimum and
suboptimal solutions, and it allows continuous as well as discrete variables
to be used. However, MILP requires the mathematical formulation of the entire
optimization problem, and the optimization algorithm normally needs much
computing time to solve complex optimization problems. The MILP-specific
mathematical formulation of different energy supply systems has already been
presented in several preceding works [3,4]. Therefore, this paper concentrates
on the solution process of the formulated optimization problem.



2 Solution Algorithm 

In practice, the computing time available for the solution of the described
optimization problem is limited, depending on the planning period considered
and the specifications of the operator. Besides, due to the increasing
importance of short-term and spot-market contracts, operational decisions
have to be made more frequently. However, as mentioned above, MILP is rather
slow compared to other algorithms. Thus, a special solution algorithm
containing different acceleration strategies has been developed to improve
the optimization process:

The standard MILP algorithm first computes the so-called primal solution
(PS), neglecting the discrete character of the binary variables (relaxed
optimization problem) [1]. The simplex algorithm used for this can be
significantly accelerated if existing primal results of a basis scenario are
used as start values for the solution of similar optimization problems (e.g.
scenarios with higher/lower demand or other energy prices). Different case
studies have shown that the computing time can be reduced by up to 90%.

After the primal solution has been calculated, the subsequent
branch-and-bound algorithm (B&B) is used to find the first feasible integer
solution. This is done by branching the binary variables successively to the
valid values 0 or 1, with the primal objective function value as a natural
lower limit (the objective function value of a feasible integer solution can
only be equal to or higher than the primal value). The standard B&B (sB&B)
calculates the branching order by an algorithm which first estimates the
effect of branching on the objective function for each binary variable and
then chooses the variable with the greatest effect [1]. Unfortunately, this
method carries the risk that variables which have only a small effect on the
objective function but a high influence on the feasibility of the
optimization problem are branched very late, resulting in unsatisfactory
convergence behaviour. Figure 1 illustrates the problem on the basis of two
binary optimization variables: whereas the solution algorithm has to examine
six branches to detect the infeasibility of the optimization problem (left
case), the algorithm can already be finished after two steps if another
branching order is used (right case).



[Figure 1: two branching trees side by side, left "disadvantageous branching
order", right "advantageous branching order"; crossed nodes mark infeasible
subproblems]

Fig. 1. Comparison of different branching orders



The described shortcoming of the sB&B can be reduced if the branching order
is not calculated by the mentioned estimation algorithm, but by a
priority-based strategy. This strategy uses the results of the PS, where all
binary variables are relaxed to real numbers between 0 and 1, to control the
selection of branching variables in a more sophisticated way. It takes
advantage of the fact that variables with a relaxed value near 0.5 have more
influence on the feasibility when they are branched to 0 or 1 than variables
close to 0 or 1. Therefore, every discrete variable gets a priority between
1 (branched soon) and 1000 (branched late), which is calculated on the basis
of the primal results as follows:

p_i = 1 + 1998 \cdot |x_i - 0.5|   (1)

where p_i is the priority of variable i at the beginning of the B&B and x_i
the relaxed value of variable i after PS.
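
The priority rule can be written down directly. The short Python sketch
below computes the priorities 1 + 1998 * |x - 0.5| from a list of relaxed
values and derives a branching order from them; the function names and the
list-based interface are illustrative assumptions, not part of the solver
used in the paper.

```python
def branching_priorities(relaxed):
    """Map relaxed binary values x_i in [0, 1] to priorities in [1, 1000]."""
    return [1 + 1998 * abs(x - 0.5) for x in relaxed]

def branching_order(relaxed):
    """Indices of the variables, lowest priority value (branched first) first."""
    prio = branching_priorities(relaxed)
    return sorted(range(len(relaxed)), key=prio.__getitem__)
```

A relaxed value of 0.5 gives priority 1, while values of exactly 0 or 1 give
priority 1000, so the variables that are farthest from being integral are
branched first.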

After a first feasible integer solution has been found, further solutions
are searched for by the Dynamic Search Strategy (DSS), which removes two
further drawbacks of the sB&B:

• The objective function values of several feasible solutions often differ
only slightly from each other, so valuable computing time is wasted without
finding a significantly better solution.

• The last integer solution found cannot be treated as the global solution
before the whole branching tree has been analyzed. Thus, the time between
finding the last solution and the proof that there are no better
solutions can be unacceptably long.








The basic idea of the DSS (illustrated in Fig. 2) is not only to search for
integer solutions (depth-first search), but also to raise the lower limit
(breadth-first search). The raised lower limit is used to decrease the accuracy
value of the last integer solution found, which then becomes

    $\sigma = (v_{sol} - v_{LL}) / v_{sol}$    (2)

where $\sigma$ is the accuracy, $v_{sol}$ the objective function value of the last
integer solution found and $v_{LL}$ the value of the highest lower limit. During
optimization the DSS alternates between the depth-first search, which looks for
better solutions, and the breadth-first search, until the accuracy falls
below a given value. The switches are triggered by a computing time limit,
which is determined heuristically.
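
As a small worked illustration of Eq. (2), the accuracy and the resulting
stopping test can be written as follows. This is a sketch only; the function
names and the tolerance parameter are assumptions, not part of the described
algorithm.

```python
def accuracy(v_sol, v_ll):
    """Relative gap between the last integer solution and the highest lower limit."""
    if v_sol == 0:
        raise ValueError("objective value of the integer solution must be non-zero")
    return (v_sol - v_ll) / v_sol


def search_may_stop(v_sol, v_ll, tolerance):
    """True once the accuracy has dropped to the given tolerance or below."""
    return accuracy(v_sol, v_ll) <= tolerance
```

For an integer solution with objective value 100 and a lower limit of 98, the
accuracy is 0.02, so a tolerance of 0.05 would end the search while a tolerance
of 0.01 would not.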



[Figure 2: panels (a) to (d), each showing the positions of PS, LL and IS along
the objective function value axis]

PS : primal solution    Obj : objective function value
IS : integer solution   LL  : lower limit

Fig. 2. Exemplary sequence of the DSS 



As a result, several measures to accelerate the optimization process
were successfully implemented and merged into the Priority-Based
Dynamic Search Strategy (PBDSS).

3 Case Studies 

The functionality of the improved MILP algorithm is demonstrated
on the basis of a complex energy supply system, which consists of 8 plants, 12
purchasing contracts and 20 supplying contracts. The resulting mathematical
optimization problem contains 4445 variables (732 of them binary), 5309
restrictions and a coefficient matrix with 21999 non-zero elements. The
calculations were run on a PC with a 900 MHz Pentium III processor.

First, the acceleration of the simplex algorithm is examined. For
this purpose, the PS of the considered energy supply system (basis scenario)
and of three further scenarios (characterized by higher demand, lower demand
and modified energy prices) are calculated. Subsequently, the PS of the three
scenarios are calculated again, now using the results of the basis scenario
as the start solution for the simplex algorithm. In Table 1 the computing times
for the PS with start solution are compared to those without start solution.
The computing times are significantly reduced for all scenarios.



Table 1. Computing times for the PS (in sec) 



scenario          without start solution   with start solution
higher demand     30.18                    7.68
lower demand      30.51                    6.92
modified prices   24.77                    13.56



Next, the first integer solution of the basis scenario is searched once
with the sB&B (branching order determined by the estimation algorithm) and
once with the PBDSS (branching order affected by priorities calculated on
the basis of the PS). The convergence behaviour of the two calculations
is presented in Fig. 3. The graph shows the remaining infeasibilities
during the B&B process. At the beginning of the B&B there are about 400
infeasibilities, which have to be reduced to zero before a feasible integer
solution is found.




Fig. 3. Convergence behaviour of sB&B and PBDSS 



In comparison to the sB&B, the priority-based search yields a more
stable and faster convergence. In spite of the shorter computing time, the
objective function value of the first PBDSS solution is better than that of the
sB&B solution. This shows that the assignment of priorities is a powerful method
to improve the convergence of restrictive optimization problems.




[Figure 4: curves plotted over time [s], 0 to 900 s]

Fig. 4. Comparison of sB&B and PBDSS 



Figure 4 finally compares the whole optimization process of the sB&B
and the PBDSS over the limited time. The sB&B finds a multitude of
feasible integer solutions whose objective function values differ only slightly
from each other, so that a significant improvement of the accuracy cannot
be achieved. In contrast to the standard algorithm, the PBDSS alternates
between searching for an integer solution and improving the accuracy.
Therefore, the computing time is used more effectively and better integer
solutions are found in less time. Moreover, the accuracy of the last solution
found is continuously decreased, which keeps the system operator, who is waiting
for the optimal schedule, up to date about the optimization process.
Overall, the exemplary results show that a significant acceleration of the
MILP algorithm can be achieved by the PBDSS.



References 

1. Beale E. M. L. (1998) Introduction to Optimization. Wiley, Chichester 

2. Hackländer P., Verstege J. F. (1999) Public cogeneration in Liberalized Energy
Markets: Operation Concepts to Sustain Competitiveness. Euroheat & Power -
Fernwärme International, Vol. 29, No. 1/2, pp. 77-85

3. Illerhaus S. W., Verstege J. F. (1999) Optimal Operation of CHP-based Industrial
Power Systems in Liberalized Energy Markets. In: IEEE (Ed.) Proceedings
of the Power Tech '99 Conference, Budapest, Hungary, BPT99-352-13

4. Maubach K.-D., Verstege J. F. (1993) Long-Term Operation Planning in Combined
Heat and Power Supply Systems. In: PSCC (Ed.) Proceedings of the 11th
Power Systems Computation Conference, Avignon, France, pp. 601-607





On On-line Systems for Short-term 
Forecasting for Energy Systems 



Henrik Aalborg Nielsen, Torben Skov Nielsen, and Henrik Madsen 

Informatics and Mathematical Modelling, Technical University of Denmark, 
Richard Petersens Plads, Building 321, DK-2800 Kongens Lyngby, Denmark 



Abstract. The paper describes experiences with developing on-line computer
systems for short-term forecasting of wind power production and heat consumption
in district heating networks. The computer systems are briefly described and some
general aspects regarding system modeling with the purpose of forecasting are
discussed. One consequence of the approach used is that the stochastic properties
of the forecast errors cannot be inferred from the models generating the forecasts.
With the purpose of using the stochastic properties as input to formal OR-models
we discuss how these can be modeled.



1 Introduction 

The authors participate in the development of two on-line computer systems,
called WPPT and PRESS, for short-term energy forecasts. WPPT (Wind
Power Prediction Tool) is a system for forecasting the wind power produced
by wind turbines for horizons of up to 40-46 hours with a resolution of 30 minutes.
PRESS (In Danish: PRognose og Energi Styrings System) is a system which 
forecasts the heat consumption in district heating networks with the same 
horizon as above, but with a resolution of 1 hour. Both computer systems run 
at a number of locations. The systems use on-line meteorological forecasts 
and local climate measurements together with on-line measurements of the 
response in order to continuously update the underlying models. 

Section 2 briefly describes the computer programs. In Sect. 3 the preconditions
and some desirable properties of such computer systems are described;
these set the basis for the methods used, which are outlined in Sect. 4.
Section 5 discusses how the stochastic properties of the forecast errors can be
modeled. Furthermore, examples of stochastic decision problems are given.
Finally, Sect. 6 concludes the paper.

2 Brief Description of the Computer Systems 

The computer systems work on-line. By on-line we understand that the systems
continuously receive the most recent information and update the resulting
forecasts periodically (every 30 or 60 minutes). Both WPPT and
PRESS have been coded in C/C++ and run under Linux / X11. The authors
believe that the stability of Linux or another UNIX variant is desirable when
running an on-line application.

WPPT [14] is a system for forecasting the wind power production in rela- 
tively large geographical regions and for individual wind farms. The forecasts 
for the individual wind farms are upscaled with the purpose of generating 
regional forecasts, see the end of Sect. 4. 

PRESS [6,13] is a system for forecasting the heat consumption in district
heating networks and for controlling the supply temperature in order to reduce
heat loss. The controller, a Stochastic Generalized Predictive Controller
[17], is an essential part of the system. Due to flow restrictions, the forecasts
of heat consumption set a lower limit on the supply temperature during cold
periods.

In Fig. 1 an overview of the information flow of the forecasting systems
is depicted. Note that measured values of the dependent variable (e.g. wind
power production) are used as input to the forecasting system. The reason for
using climate measurements is that on the very short term, these are locally
more adequate than meteorological (MET) forecasts [8, Sect. 6.5]. The climate
measurements are also needed in models which include low-pass filtered
versions of the climate variables [8, Sect. 7.1]. The output of WPPT and
PRESS also includes information regarding the uncertainty of the forecasts.

3 Preconditions and General System Considerations 

The main information which is supplied to the computer programs is indi- 
cated in Fig. 1. Furthermore, information from the physical system, such as 
the fraction of wind turbines actually running i.e. not being out for main- 
tenance or other reasons, and time/calendar information is supplied to the 
computer systems. 

Except for the meteorological forecasts, the information can typically be
sampled with the frequency required to update the forecasts as often as
desired. However, the meteorological forecasts are not updated very frequently,
nor is their resolution very high [16], and interpolation is used to
circumvent this.

Since the physical systems considered are not stationary it is a precondi- 
tion for the computer systems to be able to adapt to changes in the physical 




Fig. 1. Overview of the information flow of the on-line forecasting systems. The 
dashed line on the plot of the forecast indicates the time at which the forecast is 
generated 








systems. A typical example is that additional consumers are added to a dis- 
trict heating network. The computer system should detect this and adapt to 
the new situation without human intervention. 

While both of the physical systems are governed by the climate, the heat
consumption (PRESS) is also affected by diurnal and weekly variation not
related to the climate. Also, while wind turbines react almost instantly to
changes in wind speed, the time constants in district heating systems are
much larger, see [8].

4 Outline of the Methods Used 

Let $y_t$ denote the dependent variable at time $t$. Note that we assume that the
time scale is standardized so that the time steps can be indexed as
$\ldots, t-1, t, t+1, t+2, \ldots$.

If the possible dynamic dependence on the explanatory variables is neglected,
the dependence of $y_t$ on a single explanatory variable $x_t$ (e.g. the wind
speed) can generally be described as

    $y_t = f(x_t) + e_t$ ,    (1)

where $f$ is an unknown function, often assumed to be smooth, and where
$e_t$ denotes the model error and a possible measurement error. The error is
assumed to have a mean value of zero.

The $k$-step forecast of $y_{t+k}$ generated at time $t$ is denoted
$\hat{y}_{t+k|t}$. If $f$ in (1) is known, and since $e_{t+k}$ is unpredictable,
the forecast might be generated as

    $\hat{y}_{t+k|t} = f(x_{t+k|t})$ ,    (2)

where $x_{t+k|t}$ is the meteorological forecast, e.g. of wind speed. It is clear
that $x_{t+k|t}$ will deviate from the actual value $x_{t+k}$ that will eventually
act on the physical system. For this reason the use of $f$, assuming it is known,
for the purpose of forecasting is not necessarily an optimal solution. The same
holds for estimates of $f$ based on (1). In fact, for linear models it has been
shown that it is generally better to use estimates based on the forecasts of
the explanatory variables rather than on the actual explanatory variables [5].
See also the simulations in [8, Sect. 6.5].

In conclusion, the $f$ used for $k$-step forecasts should be obtained as the
estimate of $f_k$ in the model

    $y_{t+k} = f_k(x_{t+k|t}) + e_{k,t+k}$ .    (3)

Precisely how to estimate $f_k$ depends on which additional assumptions are
applied to the problem. For wind power curve estimation local regression
proves to be a good solution, and especially the adaptive procedure described
in [11,16] is well suited for this purpose in that the power curve is only
updated, and old information weighted down, for wind speeds actually occurring.










[Figure 2, left panel: x-axis SD of log(error); right panel: x-axis wind speed
and wind speed forecast]




Fig. 2. Forecast results based on (1) (dotted) and (3) (full). Left: Mean over 50 
simulations of Root Mean Squared forecast error (RMS) versus the measure of 
uncertainty of the wind speed forecast. Right: An example of the fits obtained 
using the uncertainty indicated by the vertical line on the left plot 



Figure 2 shows a simulated example where $f$ in (1) is a line through the
origin with a slope of 16 MW/(m/s) and the $e_t$ are independent zero-mean
Gaussian variables with a standard deviation (SD) of 5 MW. The wind speed consists
of 17513 real measurements, and the forecasts are simulated by multiplicative,
independent log-normal errors with a mean chosen to ensure that the means of
forecasted and actual wind speeds are equal. The fits are obtained using 200
observations, while the remaining data is used for evaluating the forecasts.
Note that with the particular error structure it will be advantageous to use
a non-linear relationship in (3) although (1) is linear.
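
The effect can be reproduced qualitatively with a small simulation. The sketch
below is an illustration under assumptions that differ from the setup above:
wind speeds are drawn uniformly between 2 and 20 m/s instead of using real
measurements, 2000 observations are used for fitting instead of 200, and a
straight line with intercept is fitted by ordinary least squares in both cases.
All function names are hypothetical.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares fit y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def rms(errors):
    """Root mean squared value of a list of errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def simulate(n_train=2000, n_test=2000, sd_log=0.3, seed=1):
    """Return (RMS using the fit on actual x, RMS using the fit on forecasted x)."""
    rng = random.Random(seed)
    n = n_train + n_test
    x = [rng.uniform(2.0, 20.0) for _ in range(n)]  # assumed wind speeds
    # Multiplicative log-normal forecast error with mean 1.
    x_fc = [xi * math.exp(rng.gauss(-0.5 * sd_log ** 2, sd_log)) for xi in x]
    y = [16.0 * xi + rng.gauss(0.0, 5.0) for xi in x]  # model (1)
    a1, b1 = fit_line(x[:n_train], y[:n_train])      # estimate based on (1)
    a3, b3 = fit_line(x_fc[:n_train], y[:n_train])   # estimate based on (3)
    test = range(n_train, n)
    # Both fits are evaluated with the forecasted wind speed as input.
    rms_1 = rms([y[i] - (a1 + b1 * x_fc[i]) for i in test])
    rms_3 = rms([y[i] - (a3 + b3 * x_fc[i]) for i in test])
    return rms_1, rms_3
```

Calling `simulate()` returns the two RMS values; with a forecast error of this
size the fit based on the forecasted wind speed gives the smaller one.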

Additional reasons exist why model (3) rather than (1) should be used 
for estimation when adequate forecasts are the main focus. Mainly this is 
because of the fact that the model, e.g. (1), defines a set of approximations 
to the physical system. The most appropriate approximation in this set will 
often depend on the forecast horizon. The reader is referred to [2,19,21] and 
[3, Sect. 7.3]. See also the results in [8, Sect. 5.7]. 

In reality model (3) is too simple. Currently the model used in WPPT
is given by (18) and (19) in [16]. Besides a forecast of the wind speed, the
model uses forecasts of the wind direction and also includes a diurnal variation.
Furthermore, the most recent observed wind power production is used
as an explanatory variable and $f_k$ is smoothed over $k$. For district heating
systems the general conclusions outlined above still hold. However, because
of the heat dynamics of buildings, the use of low-pass filtered
climate variables and forecasts will be appropriate when forecasting the heat
consumption, see [8].

The procedure outlined so far assumes that the dependent variable is
available on-line. For wind power production the values are available on-line
for certain reference wind farms, while data for smaller farms and individual
turbines is available only through the total wind power production of sub-areas,
which is available at a lower frequency than the real on-line data. The
upscaling to regional forecasts is described in [12,15].










5 On the Use of Short-term Forecasts 

Generally the forecasts of heat consumption and wind power production are
used in the daily decision process of the personnel operating the energy supply
systems. For district heating systems relevant decision problems are dynamic.
For wind power the forecasts are used when trading via the Nordic Power
Exchange (http://www.nordpool.com). Unit commitment and load dispatch
can most adequately be performed by directly considering the covariance
structure of the forecast errors. In reality all the decision problems are of
a stochastic nature, and the covariance must be derived from the actual errors.



5.1 The Covariance Structure of Forecast Errors 



Let $e_{t|s}$ denote the forecast error for time $t$ given information
up to time $s$. The forecast errors for a finite time span may be collected into
a matrix in which the rows $k = 1, \ldots, K$ indicate the forecast horizon and
the columns indicate the time at which the forecasts are generated, i.e.

    $\begin{bmatrix}
      \cdots & e_{t+1|t} & e_{t+2|t+1}   & \cdots \\
             & \vdots    & \vdots        &        \\
      \cdots & e_{t+K|t} & e_{t+K+1|t+1} & \cdots
    \end{bmatrix}$    (4)

This constitutes a $K$-dimensional stochastic process. However, the full
properties of this process cannot be derived from the forecast system itself. By
modeling the most important aspects of the process it will be possible to
simulate new realizations of the forecast errors, which is a first step towards
finding good solutions to the stochastic decision problem. For describing the
second-order properties, (4) can be modeled as a multivariate ARMA process
[4]. However, for wind power forecasts it is important to take into account
that the variability depends on the level of forecasted wind power production
and, to a lesser extent, on the time of day [9]. This cannot be accomplished
by a pure ARMA model, but may be accomplished by first standardizing the
forecast errors using an estimate of the variability. This estimate can be
obtained by smoothing the squared errors [9]; see [20] for a general description.
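
The standardization step can be sketched as follows. Note the swap of technique:
for brevity the squared errors are smoothed exponentially over time, whereas the
text relates the variability to the level of forecasted production and the time
of day. The function name and the smoothing constant are assumptions.

```python
import math


def standardize_errors(errors, smoothing=0.05):
    """Divide each forecast error by a running estimate of its standard deviation.

    The variance estimate is an exponentially smoothed average of squared errors
    (an assumed stand-in for the smoother used in practice). Each error is scaled
    with the estimate available before that error is observed.
    """
    if not 0.0 < smoothing < 1.0:
        raise ValueError("smoothing must lie strictly between 0 and 1")
    if not errors:
        return []
    variance = sum(e * e for e in errors) / len(errors)  # start value
    if variance == 0.0:
        raise ValueError("all errors are zero; nothing to standardize")
    standardized = []
    for e in errors:
        standardized.append(e / math.sqrt(variance))
        variance = (1.0 - smoothing) * variance + smoothing * e * e
    return standardized
```

The standardized series could then be passed to an ARMA-type model, and
simulated values rescaled with the same variance estimate.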



5.2 Examples of Decision Problems 

Control of Supply Temperature: With the purpose of reducing heat loss in
district heating networks it is advantageous to keep the supply temperature
at a reasonably low level, while water of a certain minimum temperature is
supplied to the consumers. Due to restrictions in flow capacity the heat demand
sets a lower limit on the supply temperature. This limit is active in
periods with a relatively high demand. During other periods the temperature
limit at the consumers dictates the supply temperature. The stochastic
control problem just outlined is handled by PRESS. For an overview of the
principle the reader is referred to [13], whereas [7] gives more details. The
stochastic controller is described in [17].








Production Planning for District Heating Plants: At district heating plants
with cogeneration of heat and power, possibly with time-varying electricity
prices, the decision problem may be how to use a heat accumulator for
optimizing the operation of the plant [10]. In [10] the criterion function of the
dynamic optimization problem includes (i) revenue from selling power at a
price varying over time, (ii) cost of purchasing natural gas at a price varying
over time, (iii) cost related to starting and stopping the gas turbine, and
(iv) maintenance costs depending on the amount of power produced by the
plant. The decision problem is complicated by the stochastic nature of the
forecast errors of the heat demand. In the future the problem may be further
complicated by power and gas prices including a stochastic component.

Optimal Scheduling of CHP Production: The optimal scheduling of CHP 
production may be performed on a weekly basis and in this case it consists 
of unit commitment and load dispatch. This problem is solved by SIVAEL 
[1,18]. One of the major stochastic components of this optimization problem 
is the deviation of the actual wind power production from the forecasts. Based 
on the principle described in Sect. 5.1 SIVAEL has recently been updated in 
order to accomplish this, see also [9]. 

6 Conclusion 

The preconditions and methods for short-term forecasts of heat consumption 
and wind power are outlined in the paper. Furthermore decision problems 
related to such systems are briefly described. It is argued that the stochastic 
properties of the forecast errors must be directly modeled in order to facilitate 
the use of these properties in a formal decision process. 

7 Acknowledgments 

The authors wish to acknowledge the cooperation with the partners with 
whom we have been working on short-term energy forecasts. These include 
ABB, Elkraft, Elsam, Eltra, KARA, Rambøll, Risø, University of Lund,
VEKS, and Vestkraft. Also we wish to acknowledge the financial support 
which we have received from the Danish Ministry of Energy, the European 
Commission, and from some of the partners mentioned above. 



References 

1. P.B. Eriksen. Economic and environmental dispatch of power/CHP production
systems. Electric Power Syst. Res., 57:33-39, 2001.

2. M. Gevers. Identification for control. A. Rev. Control, 20:95-196, 1996. 

3. T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. 
Springer, 2001. 








4. G.M. Jenkins and A.S. Alavi. Some aspects of modelling and forecasting
multivariate time series. J. Time Ser. Anal., 2:1-47, 1981.

5. B. Jonsson. Prediction with a linear regression model and errors in a regressor.
Int. J. Forecast., 10:549-555, 1994.

6. H. Madsen, T.S. Nielsen, and H.Aa. Nielsen. Intelligent control. Danish Board
of District Heating - News from DBDH, 3:14-16, 2001.

7. H. Madsen, T.S. Nielsen, and H.T. Søgaard. Control of Supply Temperature.
Dept. of Mathematical Modelling, Tech. Univ. of Denmark, Lyngby, Sept. 1996.

8. H.Aa. Nielsen and H. Madsen. Predicting the Heat Consumption in District
Heating Systems using Meteorological Forecasts. Dept. of Mathematical
Modelling, Tech. Univ. of Denmark, DK-2800 Lyngby, Denmark, 2000.
http://www.imm.dtu.dk/~han/pub/efp98.pdf.

9. H.Aa. Nielsen and H. Madsen. Analysis and simulation of prediction errors
for wind power productions reported to NordPool (in Danish). Informatics and
Mathematical Modelling, Tech. Univ. of Denmark, Lyngby, 2002.

10. H.Aa. Nielsen, H. Madsen, and T.S. Nielsen. Load scheduling for decentralized
CHP plants. Dept. of Mathematical Modelling, Tech. Univ. of Denmark,
DK-2800 Lyngby, Denmark, 2000.
http://www.imm.dtu.dk/~han/pub/efp98akk.pdf.

11. H.Aa. Nielsen, T.S. Nielsen, A.K. Joensen, H. Madsen, and J. Holst. Tracking
time-varying coefficient-functions. Int. J. of Adapt. Control and Signal
Processing, 14(8):813-828, 2000.

12. T.S. Nielsen, editor. Using Meteorological Forecasts in On-line Predictions of
Wind Power. Eltra, DK-7000 Fredericia, Denmark, 1999. ISBN: 87-90707-18-4.

13. T.S. Nielsen and H. Madsen. Control of supply temperature in district heating
systems. In Proceedings of the 8th International Symposium on District Heating
and Cooling, Trondheim, Norway, 2002.

14. T.S. Nielsen, H. Madsen, and H.S. Christensen. WPPT - a tool for wind power
prediction. In Proceedings of the Wind Power for the 21st Century Conference,
Kassel, 2000.

15. T.S. Nielsen, H. Madsen, H.Aa. Nielsen, G. Giebel, and L. Landberg.
Prediction of regional wind power. In Proceedings of the 2002 Global Windpower
Conference, Paris, 2002.

16. T.S. Nielsen, H.Aa. Nielsen, and H. Madsen. Prediction of wind power using
time-varying coefficient-functions. In Proceedings of the XV IFAC World
Congress, Barcelona, 2002.

17. O.P. Palsson, H. Madsen, and H.T. Søgaard. Generalized predictive control
for non-stationary systems. Automatica, 30(12):1991-1997, 1994.

18. J. Pedersen. SIVAEL simulation program for combined heat and power
production. In International Conference on Applications of Power Production
Simulation, Washington (EPRI/SEP), Jun. 11-13 1990.

19. J.A. Rossiter and B. Kouvaritakis. Modelling and implicit modelling for
predictive control. Int. J. Control, 74(11):1085-1095, 2001.

20. D. Ruppert, M.P. Wand, U. Holst, and O. Hössjer. Local polynomial
variance-function estimation. Technometrics, 39:262-273, 1997.

21. B. Wahlberg and L. Ljung. Design variables for bias distribution in transfer
function estimation. IEEE Trans. Automat. Control, 31:134-144, 1986.





Combining Bottom-up and Finance Modelling for
Electricity Markets



Christoph Weber 

Institute of Energy Economics and the Rational Use of Energy (IER), University
of Stuttgart, Germany. E-mail: cw@ier.uni-stuttgart.de



1 Introduction 

The high price volatility observed on liberalised electricity spot markets requires
detailed modelling to manage price risks successfully. For example, in December
2001 prices at the German power exchange LPX exceeded 1000 €/MWh for certain
hours, whereas the yearly average is around 22 €/MWh. In the USA, price spikes
of more than 9,000 €/MWh have even been observed. Apart from such extreme
events, price changes of 20 % from one day to the next are not unusual. In
order to cope with this volatility, price models are needed which reflect the
particularities of the electricity market, notably the non-storability of
electricity. On the other hand, the experience from other commodity markets also
has to be integrated in any model for electricity prices. Therefore, a model
which combines several approaches is presented in the following.



2 Methodology 

In the past, various models have been used to analyse electricity market prices and
predict price levels and price risks (cf. Weber 2002). Each of these approaches has
specific strengths and weaknesses, but especially when it comes to risk assessment
in the medium and longer term, none of them is very satisfactory by itself.
Finance mathematical models adequately analyse price fluctuations due to market
movements, but they cannot cope with the impact of changes in the production
technology. Furthermore, given the seasonal, climatic and other time-varying
exogenous effects, long time series in stable market environments would be required
to estimate stochastic patterns with sufficient accuracy. On the other hand, dealing
with stochastic effects in fundamental models is hardly possible. Therefore, a
combination of both approaches seems most appropriate. How this can be
achieved is sketched in Fig. 1.








[Figure 1: model chain, from bottom to top:
1. Model for primary energy -> primary energy prices
2. Fundamental model -> system marginal costs
3. Model of utility strategies -> equilibrium price
4. Electricity trading model -> spot and future prices with corresponding
   distributions]






Fig. 1. General approach for the integrated electricity market model 

In a first step, the stochastic development of prices on the primary energy market
is modelled through a finance-type model (cf. Sect. 2.1). The resulting energy
prices are taken as an input to a fundamental model of the European electricity
market (cf. Sect. 2.2). This model yields marginal generation costs differentiated
by time of the day, type of day and month in the year. These costs could be used
as input to a game-theoretic model yielding the prices and mark-ups charged by
strategic players in the market ("Model of utility strategies"). This part of the
model has, however, not been implemented so far and is therefore not discussed in
the following. Rather, the marginal costs are directly used as an input for a
stochastic model of the electricity market (cf. Sect. 2.3).



2.1 Model for primary energy prices 

An important characteristic of the primary energy markets for oil and gas is that
periods of high price volatility alternate with periods of rather stable prices.
This can be dealt with using a model with switching regimes, such as the one
proposed by Erdmann (1995). However, the moments of regime switch can
hardly be determined a priori, and therefore an interesting alternative is to model
the primary energy markets through a mean-reversion process for the derivatives of
(logs of) prices. In this model the price changes evolve around an equilibrium price
drift, which could be derived theoretically, e.g. from Hotelling's rule. The model
accounts for the fact that periods with low and with high price drift and volatility
are observed. Furthermore, the correlations between price changes for various
energy carriers are also accounted for.

Empirical estimation results show that the price changes have a strong tendency 
to return to their equilibrium growth rate whereas this growth rate is rather close to 
zero. The stochastic component is most important for heavy fuel oil and lowest for 
coal. A further important point to be noted is the strong correlation between all 
price movements in the long run (correlation of price levels). All correlations ex- 
ceed 0.75. This has to be accounted for when doing Monte-Carlo-simulations of 
the price movements for primary energy carriers. 



2.2 Fundamental model 

Starting from given primary energy prices, this model determines the marginal 
generation costs in the Europe-wide power network. This is done using an LP 
model describing the minimization of the total variable system costs. 

The main restrictions to be included are the demand restrictions for each region,
the capacity restrictions for each plant type and the transmission capacity
restrictions, which also account for the corresponding availabilities. Further
equations describe start-up costs as well as minimum start-up times and minimum
down times. Since hydro power from storage and pump-storage plants is of
considerable importance in the continental European power system, time coupling
constraints for hydro storage also have to be considered. Therefore, one year is
taken as optimization period, and within that year producers are assumed to have
perfect foresight and static price expectations on fuel prices. In the current
version of the model, twelve typical days and six regions are distinguished.
Electricity exchanges with regions outside the model scope are modelled
exogenously. Data sources for the model include UCTE statistics on electricity
load, electricity exchanges and transmission capacities, a database of power plants
throughout Europe and fuel price statistics. Power plants are classified according
to the main fuel used and the vintage into 21 classes in order to keep the problem
size reasonable. Nevertheless, the optimisation problem has about 61,000 variables,
83,000 constraints and 220,000 non-zero elements. Implemented in GAMS and using
a CPLEX solver, computation time on a Pentium III PC is about 3 minutes for a
solve from scratch and about 10 s if an initial solution from a previous run is
provided.

The model is operated in two different ways: Firstly, it is used with historical 
fuel prices and demand to determine the equilibrium prices for past time periods 
which are then used as an input for estimating the stochastic electricity price 
model described below. Thereby, simulations are carried out for each historical 
month using the fuel price information available at that time. 

Secondly, the model is operated using Monte-Carlo-simulations of future fuel 
prices as input, in order to derive stochastic equilibrium prices. Due to the possi- 
bilities of fuel switches, variations in hydro-use or changes in import-export flows, 
the relationship between fuel prices and electricity prices is then far from being 
linear. 








2.3 Electricity trading model 

Besides deterministic impacts from marginal generation costs, electricity prices 
are also subject to stochastic changes - due e. g. to load fluctuations, unforeseen 
outages or trader speculation. As others have shown (e. g. Gibson and Schwartz 
1990; Brennan 1991), electricity prices exhibit mean reversion. However the 
prices probably do not tend to return to a time-invariant mean, but rather to one 
which is determined by the fundamentals governing the electricity market. There- 
fore, the following model has been specified: 

    $d \ln p_{E,t} = \alpha_{E,h} \, (k_h + \ln p_{G,t} - \ln p_{E,t}) + \varepsilon_{E,t}$    (1)

that is, the price change depends on the difference between the equilibrium
price $p_{G,t}$ and the actual price $p_{E,t}$, possibly adjusted by a constant
$k_h$, and is superposed by a stochastic term $\varepsilon_{E,t}$. Similarly to the
approach adopted by Ramanathan et al. (1997), each hour $h$ is modeled by a
separate stochastic process.
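
The mean-reverting behaviour described above can be illustrated with a
discrete-time sketch for a single hour of the day: in each step the log price
moves a fraction of the way towards the log equilibrium price plus a constant,
and a random shock is added. All names and parameter values are assumptions
chosen for illustration, and the shock has a constant standard deviation here.

```python
import math
import random


def simulate_log_price(log_p0, log_equilibrium, reversion_speed,
                       constant=0.0, noise_sd=0.0, seed=0):
    """Simulate a log price that reverts towards a time-varying equilibrium.

    log_equilibrium: sequence of log equilibrium prices, one per time step.
    Returns the path of log prices including the start value.
    """
    rng = random.Random(seed)
    log_p = log_p0
    path = [log_p]
    for log_pg in log_equilibrium:
        shock = rng.gauss(0.0, noise_sd) if noise_sd > 0.0 else 0.0
        log_p += reversion_speed * (constant + log_pg - log_p) + shock
        path.append(log_p)
    return path
```

Without noise, a price starting at 40 €/MWh with an equilibrium of 20 €/MWh and
a reversion speed of 0.5 halves its log distance to the equilibrium in every step.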

It should be noted that part of the stochastic price fluctuations is due to
variations in the load or available capacity. Yet, the impact of these fluctuations
depends on the slope of the merit order curve¹.

Therefore, the change A in marginal costs for a change in load by ±15 percent 
has been included as explanatory variable in a GARCH-type model of volatility: 

O’" {^E,, ) = «h£e.J + Ao-' (f )+ (1 - a* - A + n (A£, - ) (2) 

Alternatively, a specification has been tested in which the variance depends on
the price level of the previous day, reflecting the fact that volatility in
electricity markets tends to increase with the price level.

It turns out that the model using the change in marginal costs $\Delta$ as
explanatory variable significantly outperforms an ordinary GARCH model in 11 out
of 24 hours. In those hours, it mostly also outperforms the alternative
formulation with $\ln p_E$ as the variance-explaining variable. In four other
hours, however, the latter model performs considerably better than the former.
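
A GARCH(1,1)-type variance recursion with an additional explanatory variable can
be sketched as follows. The functional form, the centring of the explanatory
variable, and all parameter names (`alpha`, `beta`, `gamma`, `long_run_var`) are
assumptions made for illustration; the sketch does not reproduce the estimated
specification.

```python
def garch_x_variance(residuals, delta, alpha, beta, gamma, long_run_var):
    """Conditional variances of a GARCH(1,1)-type model with an exogenous term.

    residuals    : price residuals, one per step
    delta        : explanatory variable per step, e.g. the change in marginal
                   cost for a load change of +/-15 percent
    alpha, beta  : weights of the lagged squared residual and lagged variance
    gamma        : sensitivity to the centred explanatory variable (assumed form)
    long_run_var : unconditional variance level
    """
    delta_mean = sum(delta) / len(delta)
    variances = [long_run_var]
    for t in range(1, len(residuals)):
        var_t = (alpha * residuals[t - 1] ** 2
                 + beta * variances[t - 1]
                 + (1.0 - alpha - beta) * long_run_var
                 + gamma * (delta[t] - delta_mean))
        # Guard against negative variances caused by the exogenous term.
        variances.append(max(var_t, 1e-12))
    return variances


if __name__ == "__main__":
    res = [1.0, 1.0, 1.0]
    slope = [0.0, 0.0, 3.0]            # steeper merit order in the last step
    print(garch_x_variance(res, slope, 0.1, 0.8, 0.5, 1.0))
```

With `gamma = 0` the recursion reduces to an ordinary GARCH(1,1) model, which is
the benchmark the text compares against.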



3 Results 

Fig. 2 shows examples of price simulations obtained with the integrated model.
The prices observed until the end of June 2001 were used as the database for the
model, and on that basis prices for August 2001 were simulated. The real market
prices for the first and the last week of August are also included in the figure.
Shape and level of the simulated prices are similar to the observed ones,
although the exact price pattern is of course not matched.



¹ The merit order curve is obtained by ordering the power plants by increasing
marginal generation costs. It then gives the marginal generation costs as a
function of cumulative capacity.









Fig. 2. Comparison of simulated (as of 30 June 2001) and real spot market prices
in August 2001 (hourly prices; simulations based on data up to the end of June
2001)

Fig. 3 compares the price forecasts and uncertainty ranges implied by the
integrated model with forward prices and real prices for November 2001. For this
month, the integrated model provides a better forecast than the forward price
does. The uncertainty range also looks plausible and is similar to that of a pure
mean-reversion model. An important difference between the two models, however, is
that the uncertainty range of the integrated model increases with the forecasting
horizon because of the uncertain primary energy prices. The mean-reversion model,
by contrast, shows no increase after about one week.



4 Final remarks 

Combining fundamental analysis with a finance model for analysing electricity
prices yields interesting insights and more realistic estimates, both for
expected prices and for price variations. The model developed so far clearly
requires further testing and fine-tuning, but it offers a useful basis for
estimating prices and volatilities, especially in the medium and long run. It
also provides a basis for estimating the impact of exogenous shocks (e.g.
unforeseen outages) on price levels and price volatilities.







[Figure: monthly average prices in November 2001 (base and peak), forecasts based
on data up to the end of June 2001; series shown: integrated model, forward price
of 29.6.2001, forward price of 30.10.2001, LPX.]

Fig. 3. Comparison of price forecasts of the integrated model (from 30 June 2001),
forward prices and actual prices for the standard products of November 2001



References 



Brennan MJ (1991) The price of convenience and the valuation of commodity
contingent claims. In: Lund D, Øksendal B (eds) Stochastic Models and Option
Values. North-Holland, Amsterdam et al, pp 33-72

Erdmann G (1995) An Evolutionary Model for Long Term Oil Price Forecasts. In:
Wagner A, Lorenz H (eds) Studien zur Evolutorischen Ökonomik III. Duncker &
Humblot, Berlin, München, pp 143-161

Gibson R, Schwartz E (1990) Stochastic convenience yield and the pricing of oil
contingent claims. Journal of Finance 45:959-976

Johnson B, Barz G (1999) Selecting Stochastic Processes for Modelling Electricity
Prices. In: Jameson R (ed) Energy Modelling and the Management of Uncertainty.
Risk Books, London

Ramanathan R, Engle R, Granger CWJ, Vahid-Araghi F, Brace C (1997) Short-run
forecasts of electricity loads and peaks. International Journal of Forecasting
13:161-174

Weber C (2002) Electricity markets: Coping with price risk through integrated
bottom-up and financial modeling. Working Paper, Stuttgart





Design of Material Flow Networks for Product Recycling¹



Prof. Dr. Thomas Spengler; Dipl.-Geoökol. Grit Walther

Technische Universität Braunschweig, Institut für Wirtschaftswissenschaften, Abt.
BWL, insb. Produktionswirtschaft, Katharinenstr. 3, 38106 Braunschweig, Tel.:
+49 (531) 391 2201, Email: t.spengler@tu-bs.de; g.walther@tu-bs.de



Abstract. This paper presents a concept for the economic evaluation of
alternative material flow networks for the recycling of waste electrical and
electronic equipment. It takes into account investment-dependent costs, variable
material flow and process costs, and other overhead costs. To determine the
material flow and process costs, a quantity structure is derived from a linear
optimization model. The concept is applied to consumer electronics in the German
federal state of Lower Saxony.

Keywords. Material flow network, waste electrical and electronic equipment,
recycling



1 Initial Situation, Objectives, Approach

The Directive on Waste Electrical and Electronic Equipment (WEEE), which is about
to be adopted at the European level, contains the obligation to take back and
recycle waste electrical and electronic equipment. Against this background, this
paper develops an approach for designing material flow networks for the
collection and treatment of such equipment.

First, current and future developments in the treatment of waste electrical and
electronic equipment are discussed. Next, a concept for the design and economic
evaluation of alternative material flow networks is outlined. A data and as-is
analysis of existing material flow networks is presented after the optimization
model. An outlook concludes the paper.



2 Current and Future Developments in the Treatment
of Waste Electrical and Electronic Equipment

In the market segment of electrical and electronic equipment, ever shorter
innovation cycles and rising household equipment rates lead to growing quantities
of end-of-life products. The volume of waste electrical and electronic equipment
in Germany is estimated at about 1.8 million t/a [2], of which only about 0.46
million t/a is currently collected and properly treated [4]. This is problematic
above all because of the pollutant content of the discarded devices, but also
because the valuable materials they contain could be recovered. A directive on
waste electrical and electronic equipment (WEEE) under discussion at the European
level [3] contains the obligation to set up take-back systems, to guarantee
take-back of discarded devices free of charge, and to comply with collection,
recycling, and recovery quotas.

¹ The project „Gestaltung und Lenkung von Stoffstrom-Netzwerken zum Recycling
komplexer Verbundprodukte - am Beispiel Brauner Ware“ (design and control of
material flow networks for recycling complex composite products, using brown
goods as an example) is funded by the Deutsche Forschungsgemeinschaft (grant
SP491/1).

As a consequence of implementing the EU WEEE directive, structural changes are to
be expected, above all for small and medium-sized disassembly companies. The
system alternatives shown in Table 1 are available; in each case, the option
listed first corresponds to the current situation.



Table 1. Alternatives for material flow networks in the recycling of electrical
and electronic scrap

Influencing factor                    Possible options of the alternatives
------------------------------------  ---------------------------------------------
Treatment of waste electrical and     - decentralized system (SMEs with a capacity
electronic equipment                    of 2,500 t/a)
                                      - advanced centralization
                                      - complete centralization
In-plant equipment of the             - unlinked individual workstations
disassembly companies                 - line linkage
                                      - highly automated disassembly systems
Provision of waste electrical and     - decentralized provision (at municipalities)
electronic equipment                  - central provision (sorting centre)
Pre-sorting of the products           - unsorted
                                      - slightly sorted (large/small appliances)
                                      - sorted (large/small white goods / TV /
                                        brown goods / ICT)
Cooperation of the disassembly        - single actor
companies                             - regional network
                                      - strategic network



This paper focuses on the following planning tasks:

- (total) capacity of the actors to be involved in the material flow network,

- number of actors to be involved in the material flow network,

- type of actors to be involved in the material flow network,

- choice of technology for in-plant disassembly systems.

A comparison and evaluation of future system alternatives for disassembly and
recycling companies is currently lacking. No information is available on the
economic efficiency of the individual alternatives. For this reason, a concept
for modelling and evaluating alternative material flow networks for electronic
scrap recycling is presented in the following.









3 Modelling and Evaluation Concept

The aim of the concept is to design material flow networks for the economically
efficient disassembly of waste electrical and electronic equipment.
Specialization of the disassembly companies, and thus the realization of learning
curve effects [5], is assumed.

The expected annual profit of all actors in the overall network is a generally
accepted and easily interpretable criterion for the economic advantageousness of
the individual system alternatives. The following model [10] is therefore chosen
as the basis for the economic evaluation:

$$ K^{KrW} = K^{KrW}_{Invest} + K^{KrW}_{Stofffluss} + K^{KrW}_{Prozess} + K^{KrW}_{sonst} \qquad (1) $$

Variables: $K^{KrW}$ - decision-relevant total costs, $K^{KrW}_{Invest}$ -
investment-dependent costs, $K^{KrW}_{Stofffluss}$ - material flow costs,
$K^{KrW}_{Prozess}$ - process costs, $K^{KrW}_{sonst}$ - other overhead cost
shares

To derive the investment-dependent costs, the specific investments required to
build up the infrastructure are determined in a first step [6,1]. They are
determined in modular form for logistics, buildings, and disassembly systems of
different capacities, so that the investment-dependent costs can be derived for
variable depreciation periods and interest rates (see Fig. 1).

Fig. 1. Derivation of the investment-dependent costs from the specific investments

Other overhead costs such as energy demand, insurance, or maintenance costs can be
derived for the modules named above as a percentage of the investment [1].
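
One common way to turn a specific investment into annual investment-dependent
costs for a given depreciation period and interest rate is the annuity method.
The sketch below assumes this method, which the paper does not name explicitly,
and uses hypothetical module data; overheads are added as a percentage of the
investment as described above.

```python
def annuity_factor(rate, years):
    """Capital recovery factor for an interest rate and a depreciation period."""
    if rate == 0:
        return 1.0 / years
    growth = (1.0 + rate) ** years
    return rate * growth / (growth - 1.0)


def annual_fixed_costs(modules, rate):
    """Annual investment-dependent costs plus percentage-based overheads.

    modules : list of (investment, depreciation_years, overhead_share) tuples,
              e.g. one entry each for logistics, buildings, disassembly systems
    rate    : interest rate per year
    """
    total = 0.0
    for investment, years, overhead_share in modules:
        total += investment * annuity_factor(rate, years)   # capital costs
        total += investment * overhead_share                # energy, insurance, ...
    return total


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    modules = [
        (200_000.0, 10, 0.03),   # logistics
        (800_000.0, 25, 0.02),   # buildings
        (500_000.0, 8, 0.06),    # disassembly system
    ]
    print(round(annual_fixed_costs(modules, rate=0.06), 2))
```

Keeping the modules separate makes it easy to re-evaluate an alternative when the
depreciation period or the interest rate changes.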

The revenues and costs that depend on material flows and processes are determined
with a material flow model based on linear activity analysis [7]. An economically
efficient allocation of the material flows to the actors of the network is
assumed, which is obtained with a linear optimization model. The system
boundaries of the material flow network comprise the transport of the devices to
the disassembly companies, the disassembly itself, and the sale of the
disassembly fractions. Transports between disassembly companies are also
possible. The disassembly options are modelled following the mixed-integer
optimization model for integrated disassembly and recycling planning in [8].
Because of the large number of discarded devices, the integrality constraints can
be dropped in a regional analysis. Figure 2 shows the material flows for
disassembly company u.



Fig. 2. Material flow model for disassembly company u

Indices: I - products (i = 1..P) and disassembly fractions (i = P+1..I),
J - disassembly operations, Q - sources, U - companies, R - sinks
Variables: Y - masses, X - number of executions of the disassembly activities



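For the economic evaluation of the material flow model, only variable costs and
revenues are considered, in the form of acceptance revenues, transport costs,
sorting costs, disassembly costs, and recovery revenues for the disassembly
fractions. The objective is to allocate the products to the disassembly companies
such that the contribution margin is maximized (2).

MAX [contribution margin = acceptance revenues - transport costs - sorting costs
- disassembly costs + recovery revenues, summed over all sources q = 1..Q,
companies u = 1..U and sinks r = 1..R]   (2)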

The output of a disassembly company, $y^{O}_{iu}$, is calculated according to
balance equation (3) from the input of devices from all sources of the network
($y^{Q}_{iuq}$), from the input of devices and components accepted from other
disassembly companies ($y^{A}_{iuu'}$), and from the disassembly of these devices
and components, expressed by the disassembly activity $v_{ij}$ multiplied by the
number of executions of this activity ($x_{ju}$).

$$ \sum_{q=1}^{Q} y^{Q}_{iuq} + \sum_{u'=1,\,u' \ne u}^{U} y^{A}_{iuu'} + \sum_{j=1}^{J} v_{ij}\,x_{ju} = y^{O}_{iu}, \quad i = 1..I;\ u = 1..U \qquad (3) $$

According to balance equation (4), the output calculated in this way is delivered
partly to the sinks of the network ($y^{R}_{iur}$) and partly to other
disassembly companies for further disassembly ($y^{L}_{iuu'}$).

$$ y^{O}_{iu} = \sum_{u'=1,\,u' \ne u}^{U} y^{L}_{iuu'} + \sum_{r=1}^{R} y^{R}_{iur}, \quad i = 1..I;\ u = 1..U \qquad (4) $$

To ensure that the mass flows between the disassembly companies are represented
correctly, they must be balanced according to mass balance equation (5) such that
the deliveries of company u to company u' ($y^{L}_{iuu'}$) equal the quantities
accepted by company u' from company u ($y^{A}_{iu'u}$).

$$ y^{L}_{iuu'} = y^{A}_{iu'u}, \quad i = 1..I;\ u = 1..U,\ u' = 1..U \qquad (5) $$
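
The balance for a single disassembly company can be checked with a few lines of
code. The sketch below is an illustration only: the item names, the activity
coefficients, and the function `company_output` are hypothetical and not taken
from the paper. Each disassembly activity is a vector with negative entries for
the items it consumes and positive entries for the fractions it yields, as in
linear activity analysis.

```python
def company_output(source_input, partner_input, activities, runs):
    """Output per item of one disassembly company.

    source_input  : mass per item received from the sources
    partner_input : mass per item accepted from other disassembly companies
    activities    : {operation: {item: coefficient}}, negative = consumed,
                    positive = produced, per execution of the operation
    runs          : {operation: number of executions}
    """
    items = set(source_input) | set(partner_input)
    for coefficients in activities.values():
        items |= set(coefficients)

    output = {}
    for item in items:
        total = source_input.get(item, 0.0) + partner_input.get(item, 0.0)
        for operation, coefficients in activities.items():
            total += coefficients.get(item, 0.0) * runs.get(operation, 0.0)
        output[item] = total
    return output


if __name__ == "__main__":
    # Hypothetical example: one tonne of TV sets yields 0.6 t glass, 0.4 t boards.
    activities = {"dismantle_tv": {"tv": -1.0, "glass": 0.6, "boards": 0.4}}
    print(company_output({"tv": 100.0}, {}, activities, {"dismantle_tv": 100.0}))
```

In an optimization model the number of executions would be a decision variable;
here it is fixed so that the mass balance itself can be inspected.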



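The capacity restrictions consist of the available quantity of discarded devices
(6) and of the disassembly (7) and recycling capacities (8).

$$ \sum_{u=1}^{U} y^{Q}_{iuq} \le y^{QMAX}_{iq}, \quad i = 1..I;\ q = 1..Q \qquad (6) $$

$$ \sum_{j=1}^{J} c_{ju}\,x_{ju} \le C^{MAX}_{u}, \quad u = 1..U \qquad (7) $$

$$ \sum_{u=1}^{U} y^{R}_{iur} \le y^{RMAX}_{ir}, \quad i = 1..I;\ r = 1..R \qquad (8) $$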



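Furthermore, the current draft of the EU directive on waste electrical and
electronic equipment requires the removal of hazardous substances from certain
products (9). In addition, non-negativity conditions apply (10).

$$ y^{RMAX}_{ir} \begin{cases} = 0 & \text{if product } i \text{ is subject to mandatory disassembly} \\ \ge 0 & \text{otherwise} \end{cases}, \quad i = 1..P;\ r = 1..R \qquad (9) $$

$$ y^{Q}_{iuq},\ y^{A}_{iuu'},\ y^{L}_{iuu'},\ y^{R}_{iur},\ y^{O}_{iu},\ x_{ju} \ge 0 \qquad (10) $$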



Combining the investment-dependent and operating fixed costs from the first
planning step with the variable revenues under a contribution-margin-maximizing
allocation of the material flows from the second planning step, according to
equation (1), finally permits an economic evaluation of all alternatives.



4 Data and As-Is Analysis

The concept is currently being applied, as an example, to consumer electronics in
the territorial state of Lower Saxony. As part of this application, an as-is
analysis of existing material flow networks was carried out. A large number of
discarded devices were examined and assigned to a total of 7 reference devices;
39 disassembly fractions, 31 disassembly operations, and 13 disassembly companies
active in dismantling brown goods were identified. There are 47 public waste
management authorities (sources) and 56 recycling/disposal sites (sinks). A more
detailed description of the data collection can be found in [9].

The resulting linear optimization problem consists of about 71,000 continuous
variables and about 11,000 constraints. It can be solved in seconds with standard
solution procedures for linear optimization problems, using the Lingo solver with
access to Excel data sheets.








5 Outlook and Further Steps

For future work, it appears worthwhile to include collection and mechanical
processing, to consider transport logistics in more detail, to cover all product
categories from private households, and to extend the geographical boundaries to
the whole of Germany.



6 References

1. BFUB (Hrsg) (2001) Kosten der EU-weit vorgesehenen Regelungen zur Behandlung
von Elektro- und Elektronikschrott. Projektnummer B-I-01, Abschlussbericht

2. Bvse (Hrsg) (1998) Elektronikschrottrecycling - Fakten, Zahlen und Verfahren

3. European Commission (2001) Proposal for a Directive on waste electrical and
electronic equipment. Document 500PC0347(01)

4. Halstrick-Schwenk M (2001) Umfang und Struktur der Entsorgungswirtschaft im
Bereich Elektroaltgeräte/Elektronikschrott in Deutschland. RWI-Papiere Nr 65

5. Hesselbach J, v. Westernhagen K (2001) An Integrated Approach for the Planning
and Control of flexible Retro-Production Systems. Environmental Information
Systems in Industry and Public Administration. IDEA Group Publishing, Hershey, USA

6. Hetzel Elektronik-Recycling GmbH (2001) Konzeption und Realisierung eines
industriell geprägten Verwertungsbetriebs für Elektronik-Altgeräte. Teilvorhaben 2
im Verbundprojekt IREAK, Schlussbericht

7. Koopmans TC (1951) Efficient Allocation of Resources. Econometrica 19:455-465

8. Spengler T (1994) Industrielle Demontage- und Recyclingkonzepte -
Betriebswirtschaftliche Planungsmodelle zur ökonomisch effizienten Umsetzung
abfallrechtlicher Rücknahme- und Verwertungspflichten. Erich Schmidt Verlag, Berlin

9. Spengler T, Walther G, Hesselbach J, Ohlendorf M (2002) Product Assessment and
Recycling Data Analysis as Precondition for Efficient WEEE Recycling. Going Green
- Care Innovation 2002. Konferenz-CD-ROM

10. VDI 3800 (2000) Ermittlung der Aufwendungen für Maßnahmen zum betrieblichen
Umweltschutz





An Approach for Evaluating
Remanufacturing Strategies



Axel Tuma and Baptiste Lebreton 

Universität Augsburg, Lehrstuhl für Umweltmanagement,
D-86135 Augsburg, Germany



Abstract. The tightening of European environmental legislation, as seen for
example in the European Parliament's directive on waste electrical and electronic
equipment, and in particular the extension of product responsibility to the
entire product life cycle, poses new challenges for manufacturers. Closing the
loops at both the material level (recycling) and the component level
(remanufacturing) is of central importance. In this context, the instruments of
supply chain management need to be developed further with regard to the
integration of return flows.

Previous research at the strategic level focuses on WLP-based models [3] [4],
whose task is primarily the spatial design of closed-loop systems. Against this
background, this paper presents an approach for evaluating remanufacturing
strategies that emphasizes three key factors: market segmentation, technological
development, and return flow distribution. The approach is then illustrated with
a case study from the tire industry.



1 Discussion of Factors Influencing the
Remanufacturing Potential

Comparing the remanufacturing rates of different products (such as copiers,
printer cartridges, truck tires, mobile devices, and passenger cars), the
following main factors influencing the remanufacturing potential can be
identified:

• Market segmentation: In principle, certain groups of buyers have reservations
about purchasing remanufactured products. According to Bellmann [1], one reason
is the psychological obsolescence attached to reused products. This results in
different market segments for new products and remanufactured goods. As a rule,
goods with primarily functional properties, such as printer cartridges, have a
higher remanufacturing potential than fashion-oriented products (e.g. mobile
devices).







• Technological development at the component level: An essential prerequisite for
reusing individual components (e.g. optical modules, printer cartridges,
carcasses) is the absence of technological leaps (for example, the transition
from analogue to digital technology), so that reintegrating the old components
remains technically possible and economically justifiable. Critical in this
context are, on the one hand, strong innovation pressure and, on the other, the
trend towards miniaturization, both of which can be observed for mobile devices.
In the model, technological development is expressed by the parameter δ, which
gives the probability that a component manufactured at time t' can still be used
at time t.

Fig. 1. Technologically determined reusability as a function of component age

• Return flow distribution: Besides the technological restrictions, the
distribution of return flows plays a central role in planning remanufacturing
activities. Dedicated take-back systems or product-use concepts can be employed
to raise the return rate and to ensure an economically optimal return time.
Copier manufacturers use leasing contracts for this purpose. Passenger cars, by
contrast, are either exported after the use phase (system loss) or, owing to
their age, show a degree of wear that makes remanufacturing hardly feasible.

Building on these parameters, the following model for evaluating remanufacturing
strategies can be formulated.

2 Model

Indices

c : component or part

p : product

q : quality level

t, t' : period







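Variables

$X_{p,t}$ : production quantity of product p in period t.

$\beta_{p,q}$ : market share of product p in market segment q, $\beta_{p,q} \in [0,1]$.

$N_{c,p,t}$ : procurement quantity of new components c for product p in period t.

$RF_{c,t',t}$ : return flow of components c in period t that were introduced in
period t'.

$A_{c,p,t',t}$ : number of components c manufactured in period t' that are reused
in product p during period t.

$L_{c,t',t}$ : quantity landfilled in period t of component c manufactured in
period t'.

Parameters

$\pi_{p,t}$ : selling price of product p in period t.

$k^{N}_{c,t}$ : procurement cost of a new component c in period t.

$k^{RF}_{c,t',t}$ : return cost of component c in period t, manufactured in
period t' (including transport and disassembly).

$k^{L}_{c,t'}$ : landfill cost for component c in period t'.

$k^{X}_{p,t}$ : manufacturing cost of product p in period t.

$D_{q,t}$ : demand in market segment q in period t.

$a_{c,p}$ : number of components c in product p.

$\beta^{max}_{p,q}$ : upper bound on the market share of a product p in segment q.

$\rho_{p,t',t}$ : return probability of product p after t-t' periods.

$\delta_{c,p,t',t}$ : technical reintegration probability of component c in
product p after t-t' periods.

Objective function

$$ \max \sum_{p,t} \pi_{p,t}\,X_{p,t} - \sum_{c,p,t} k^{N}_{c,t}\,N_{c,p,t} - \sum_{c,p,t',t} k^{RF}_{c,t',t}\,A_{c,p,t',t} - \sum_{c,t',t} k^{L}_{c,t'}\,L_{c,t',t} - \sum_{p,t} k^{X}_{p,t}\,X_{p,t} $$

Constraints

$$ X_{p,t} \ge \sum_{q=1}^{Q} \beta_{p,q}\,D_{q,t} \quad \forall p, t \qquad (1) $$

$$ \beta_{p,q} \le \beta^{max}_{p,q} \quad \forall p, q \qquad (2) $$

$$ \sum_{p} \beta_{p,q} = 1 \quad \forall q \qquad (3) $$

$$ A_{c,p,t',t} \le \delta_{c,p,t',t}\,RF_{c,t',t} \quad \forall c, p, t', t \qquad (4) $$

$$ a_{c,p}\,X_{p,t} = N_{c,p,t} + \sum_{t' < t} A_{c,p,t',t} \quad \forall c, p, t \qquad (5) $$

$$ RF_{c,t',t} = \sum_{p} \rho_{p,t',t}\,a_{c,p}\,X_{p,t'} \quad \forall c, t', t \qquad (6) $$

$$ RF_{c,t',t} = \sum_{p} A_{c,p,t',t} + L_{c,t',t} \quad \forall c, t', t \qquad (7) $$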



The objective function expresses the manufacturer's aim of maximizing the
contribution margin across the entire product range. Constraint (1) uses the
market segmentation parameter $\beta_{p,q}$ to ensure that enough products
$X_{p,t}$ are manufactured to cover demand in the individual market segments q.
Term (2) allows psychological obsolescence to be taken into account.

Term (4) defines the upper bound on the quantity of reusable components. Equation
(5) ensures that a product p is assembled either from new components $N_{c,p,t}$
or from used components $A_{c,p,t',t}$. Equation (6) maps the return flows using
the return flow distribution $\rho_{p,t',t}$; the breakdown to the component
level is made via $a_{c,p}$. Equation (7) determines the quantity of components c
to be landfilled in period t.
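
The bookkeeping behind the return flow and landfill relations can be sketched in
a few lines. The example is illustrative only: the function names, the data
layout, and all numbers are assumptions, and the reuse quantity is set by hand
instead of being chosen by an optimizer.

```python
def return_flow(production, return_prob, parts_per_product, t_made, t_now):
    """Expected number of components made in t_made that come back in t_now.

    production        : {product: {period: quantity produced}}
    return_prob       : {product: {age in periods: probability of return}}
    parts_per_product : {product: components per product}
    """
    age = t_now - t_made
    total = 0.0
    for product, quantities in production.items():
        total += (return_prob[product].get(age, 0.0)
                  * parts_per_product[product]
                  * quantities.get(t_made, 0.0))
    return total


def reuse_limit(returned, reintegration_prob):
    """Upper bound on reusable components for one product."""
    return reintegration_prob * returned


def landfilled(returned, reused_by_product):
    """Returned components that are not reused in any product."""
    return returned - sum(reused_by_product.values())


if __name__ == "__main__":
    # Hypothetical data: 1000 premium tires built in period 0, half return at age 6.
    production = {"premium": {0: 1000.0}}
    return_prob = {"premium": {6: 0.5}}
    parts = {"premium": 1}
    returned = return_flow(production, return_prob, parts, t_made=0, t_now=6)
    limit = reuse_limit(returned, reintegration_prob=0.3)
    reused = {"retread": min(120.0, limit)}
    print(returned, limit, landfilled(returned, reused))
```

The example also shows why a higher return rate does not help by itself: whatever
is returned but not reused has to be landfilled and paid for.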

3 Case Study: Remanufacturing in the Passenger Car Tire
Industry

3.1 Current situation

Every year, 600,000 tonnes of used tires arise in Germany [5]; under the
KrW-/AbfG, which came into force in 1996, they should be sent to recovery.
Potential recovery routes are [2]:

• energy recovery (e.g. as fuel in the cement industry),

• feedstock recovery (pyrolysis to recover crude oil),

• material recovery (e.g. as an input material in bitumen production),

• reuse of the carcass to produce retreaded tires.

In the following, we examine to what extent the fourth recovery route, which
appears advantageous according to ecological product assessments [5] [2], is also
preferable economically. It should be noted that aircraft tires reach up to ten
cycles and truck tires have a reuse rate of 50%, whereas only about 12% of
passenger car tire carcasses are reused. The key parameters described above are
addressed first.








Market segmentation:

In principle, the market for passenger car tires can be roughly divided into a
so-called premium segment and an economy segment. While new branded tires are
traded in the premium segment, retreaded tires share the economy segment with
low-end tires.

Technological development:

A tire consists of a carcass and a tread. From a technological point of view,
carcasses with sufficient wall thickness that are less than six years old and
show no significant damage can be reused. This is the case only for carcasses
from the premium segment. Increasing the wall thickness, however, entails
additional material costs in the manufacturing phase and does not necessarily
extend the life of the tire in its first use phase. In principle, technological
advancement plays a minor role for carcasses and thus permits reuse after several
years.

Return flow distribution:

The return flow distribution of the tires sold depends essentially on mileage and
on climatic conditions. In regions where tires can be driven for half of the
year, the expected service life is about six years.

3.2 Model results

Baseline scenario

The tire manufacturer initially offers two products, Premium and Economy, which
can be assigned unambiguously to the segments of the same name. Demand in the
premium segment is assumed to be twice as high as in the economy segment.



Introduction of a retreaded tire

Next, the product range is extended by retreaded tires, which are placed in the
economy segment. The objective function improves by 22% because the economy
segment is served at lower cost through the use of old tire carcasses. The model
results, i.e. the computed remanufacturing rate, agree with the empirically
observed values. The low reuse rate of 12% is due to the late return and the low
reusability of carcasses. This raises the question of how an improved carcass
would affect the target quantities contribution margin and remanufacturing rate.
The difficulty is that the higher price resulting from the increased material
input can generally not be passed on to the buyer of the premium tire, who can
purchase a product that is equivalent from his point of view at a lower price.
Moreover, the manufacturer of the more ecological premium product cannot ensure
that the carcass actually returns to him so that he can, as assumed in the model,
profit from a second use phase in the retreaded tire segment. Deposit or
product-use systems are potential ways out of this problem.



Optimization of the return time

Another starting point for raising the remanufacturing rate lies, at least in
theory, in bringing the return time forward. In the model, this leads to an
increase in the number of potentially reusable carcasses. The model calculations,
however, yielded a contribution margin that was 9% lower. The deterioration is
caused by the surplus of carcasses, which cannot be reused because customer
demand for retreaded tires is limited. The sharp rise in disposal costs over the
observation period (+100%) stems from the increased quantity of surplus used
tires that have to be disposed of despite a positive δ.



References

1. Bellmann, K. (1990) Langlebige Gebrauchsgüter. DUV, Wiesbaden

2. Ferrer, J. S. (1997) The economics of tire remanufacturing. Resources,
Conservation and Recycling 19, 221-255

3. Fleischmann, M. (2001) Quantitative models for reverse logistics. Springer,
Berlin Heidelberg

4. Jayaraman, V., Guide Jr, V. D. R., Srivastava, R. (1999) A closed-loop
logistics model for remanufacturing. Journal of the Operational Research Society
50, 497-508

5. Umweltbundesamt (1999) Ökologische Bilanzen in der Abfallwirtschaft.
Umweltbundesamt, Berlin





Environmental Coordination of Supply Chain 
Networks Based on a Multi-Agent System 



Axel Tuma, Jürgen Friedl

Department of Economics,
University of Augsburg,
Universitätsstraße 16,
86135 Augsburg, Germany



Abstract. The current discussion of modern production concepts shows an
increasing trend towards network organizations. In interconnected production
systems, materials and different forms of energy are provided, converted, stored
and transported, and environmental impacts can arise at any production unit. In
this context, the allocation of workload to the different production units,
considering ecological and economic goals simultaneously, is of special interest.
Such an allocation makes it possible to use available resources more efficiently
and to reduce the emissions and by-products caused by the production process. To
achieve this goal, a model of the production system, including its input and
output streams, has to be created. Given the modelling requirements of supply
chain networks (more or less independent production units, dynamic behaviour), an
agent-oriented simulation seems adequate. In addition, evaluation methods for
modelling the economic and ecological goals have to be integrated. To address the
multi-criteria structure of such a decision problem, a goal programming approach
is discussed.



1. Agent-Oriented Simulation 

The fusion of simulation models and agent-oriented modelling techniques is an
important step towards the examination and modelling of highly complex real-world
applications. Agent-oriented simulation (AOS) combines the strengths of both
methods: simulation is a well-suited technology for modelling and analyzing
complex phenomena, while agent-oriented technologies support decentralization on
the one hand and provide the "cognitive support" often missing in earlier
simulation systems on the other.







The advantages of using AOS techniques are:

• Naturalness: Modelling many collaborating agents is far more natural, and
easier to control, than building one big monolithic system.

• Robustness: Faulty parts can be easily identified and exchanged, and extending
or modifying the system requires few update operations.

• Distribution: Since there is a clear logical partition, the physical
distribution of the system over different machines is straightforward. This makes
distributed real-time simulation possible.

Which (multi-)agent technologies are involved in an agent problem-solving process?
First of all there is the agent itself. From a general point of view there are two
schools: the reactive school, which supports efficient real-time behaviour of the
system through a fast-acting situation-response procedure, and the representational
or deliberative school, which supports explicit knowledge representation and
reasoning. The choice of one school or the other, or of a hybrid version, influences
the agent architecture. No matter which school is followed, the agent must be able to
communicate and to act on the basis of its world knowledge. Modelling communication
and the individual knowledge base, including sensor and actuator functionalities, are
therefore the primary agent technology decisions. Secondary technology decisions
concern planning, local conflict management, and basic learning abilities. Tertiary
technology decisions cover joint planning, global conflict resolution, partner
modelling, and agent-centred learning.

The transition from one agent to a society of agents raises specific design prob- 
lems. The functional interaction (as von Bechtholtsheim [1] called it) among the 
agents has to be modelled. The functional interaction influences every decision of 
the single agent model by adapting more complex mechanisms. For example, the 
communication might need language primitives and protocols for complex nego- 
tiations in addition to simple REQUEST-REPLY or SEND-RECEIVE patterns. 
There must be a qualitative step to extend single agent planning to the generation 
of joint plans including activities of different agents. For this reason explicit coor- 
dination mechanisms have to be used for the design of the agent society. 

Instead of deepening the AOS discussion (see [2] for more details), the basic
structures of the agents representing the production units of the network are
discussed in the following.



2. Configuration of the Production Agents 

According to the requirements of the allocation tasks, the agents have to be defined,
their knowledge and behaviour characterized, and their coordination mechanisms
described. The agents are described along the lines of the CommonKADS knowledge
engineering model [2].








PU-agents (Production Units):

• have the task of increasing their own marginal income and of decreasing the
emissions and by-products caused by the production process;

• perform the actions Produce and Consume;

• communicate with their direct neighbourhood by a) sending resource/job-requests
(need more material) and capacity-requests (need less material/offer jobs),
b) receiving push-requests and pull-requests, and c) negotiating on the amount (and
possibly the time restrictions) of material flow;

• have knowledge about their a) current state of satisfaction, b) productivity and
c) Producing/Consuming relationships and needs;

• plan in order to forecast their local situation and to prepare the communication.
The planning process includes the reasoning capabilities of the agents.

Each agent acts essentially on the basis of its state of satisfaction which reflects 
the fulfilment degree of its task. If an agent is not satisfied, it communicates its 
needs to the neighbours in order to increase or decrease its workload. The agent’s 
ability to negotiate enables each agent to partially accept the requests and post- 
pone the necessary activities in order to fit into the currently calculated plans. In 
the following the satisfaction calculation will be discussed in detail. Negotiation 
principles are discussed in [3]. 



3. Calculation of the Satisfaction Level of the PU-Agents

The basic idea for the construction of the satisfaction level for the single PU- 
Agent is the evaluation of the production quantity taking into account economical 
and ecological criteria. In this context the PU-Agent will be modelled according to 
the Gutenberg production function [4]. Based on intensity and production time, 
unit costs, respectively marginal income, and emissions are calculated (Fig. 1). 




[Figure: marginal income per unit and emissions per unit, plotted against the
intensity [U/PER] in %]

Fig. 1. Marginal Income and Emissions per Unit








In order to find an evaluation value for the possible production quantities the 
multi-criteria character of the problem has to be addressed. In principle this can be 
done via different methods like scoring-models, goal programming and aspiration 
level approaches. 

In the following a goal programming approach will be explained in more detail 
(Fig. 2). 



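Multi-criteria problem:

\[
\begin{aligned}
\max\ & MI(d)\cdot X \\
\min\ & EM(d)\cdot X \\
\text{s.t.}\quad
 & MI(d)\cdot X \ge MI_{\min} && (1) \\
 & EM(d)\cdot X \le EM_{\max} && (2) \\
 & d_{\min} \le d \le d_{\max} && (3) \\
 & d\cdot t_{\min} \le X \le d\cdot t_{\max} && (4)
\end{aligned}
\]

GOAL PROGRAMMING

\[
\begin{aligned}
\max\ & \alpha_1 + k\cdot\alpha_2 \\
\text{s.t.}\quad
 & MI(d)\cdot X - \alpha_1 = MI_{\min} && (1) \\
 & EM(d)\cdot X + \alpha_2 = EM_{\max} && (2) \\
 & d_{\min} \le d \le d_{\max} && (3) \\
 & d\cdot t_{\min} \le X \le d\cdot t_{\max} && (4) \\
 & \alpha_1,\ \alpha_2 \ge 0 && (5)
\end{aligned}
\]

d = intensity [U/PER]
k = key parameter [MU/EU]
MI = marginal income function [MU/U]
EM = emission function [EU/U]
EM_max = legal emission standard [EU]
X = (fixed) production quantity [U]
MI_min = enterprise specific minimal marginal income [MU]
d_min/d_max = enterprise specific minimal/maximal intensity [U/PER]
t_min/t_max = enterprise specific minimal/maximal flexible time per period [PER]

Fig. 2. Goal Programming Approach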



The key parameter k of the evaluation process has to be determined first. For k = 0
the evaluation focuses only on the economic point of view; the ecological aspects
are not considered. In contrast, for k → ∞ the economic part of the evaluation is
neglected. According to Hansmann [5], the model for k = 0 thus represents a reactive
attitude of the production unit, and the model for k → ∞ a proactive attitude. In
order to reflect behaviour between the two poles, proactive and reactive, adequate
levels of the parameter k have to be specified. For this purpose a so-called minimal
and maximal emission level are calculated. The "minimal emission level" $EMU_{\min}$
and the "maximal emission level" $EMU_{\max}$ represent the minimal and maximal
emissions per unit under the intensity and time constraints (Fig. 1). In the next
step the difference between $EMU_{\max}$ and $EMU_{\min}$ is divided into n
equidistant intervals. For each of the intervals i = 1, ..., n the corresponding
value $k_i$ (related to the maximal value of the interval) is calculated according
to the following equation:

\[ EM(d_i^{*}) = EMU_{\min} + (EMU_{\max} - EMU_{\min}) \cdot \frac{i}{n}, \qquad i = 1, \dots, n \qquad (3.1) \]

where $d_i^{*}$ represents the optimal solution of the goal program of Fig. 2. Under
the assumption that $EM(d) = a_1 d^2 + b_1 d + c_1$ and $MI(d) = a_2 d^2 + b_2 d + c_2$,
$k_i$ can be calculated (independently of X) by equation 3.2.








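\[
k_i = \frac{EM^{-1}\!\left[EMU_{\min} + (EMU_{\max} - EMU_{\min}) \cdot \frac{i}{n}\right] \cdot 2a_2 + b_2}
           {EM^{-1}\!\left[EMU_{\min} + (EMU_{\max} - EMU_{\min}) \cdot \frac{i}{n}\right] \cdot 2a_1 + b_1}
\qquad (3.2)
\]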
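The calculation of the key parameter in equations 3.1 and 3.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficients of EM(d) and MI(d) and the intensity bounds are invented, EM is assumed to rise over the whole admissible intensity range (so that its inverse is unique), and the goal constraints and the time constraint of Fig. 2 are assumed not to be binding. Under these assumptions, and for fixed X, maximising the goal programming objective is equivalent to maximising MI(d) - k·EM(d), which the grid search at the end uses as a cross-check.

```python
import math

def quad(coeffs, d):
    """Evaluate a*d^2 + b*d + c for coeffs = (a, b, c)."""
    a, b, c = coeffs
    return a * d * d + b * d + c

def inverse_on_rising_branch(coeffs, y):
    """Larger root d of a*d^2 + b*d + c = y (the rising branch when a > 0)."""
    a, b, c = coeffs
    disc = b * b - 4.0 * a * (c - y)
    return (-b + math.sqrt(disc)) / (2.0 * a)

def key_parameter(i, n, em, mi, d_min, d_max):
    """Return (d_i*, k_i) following Eq. 3.1 and Eq. 3.2."""
    emu_min, emu_max = quad(em, d_min), quad(em, d_max)    # EM assumed rising
    target = emu_min + (emu_max - emu_min) * i / n         # Eq. 3.1
    d_star = inverse_on_rising_branch(em, target)
    k = (2.0 * mi[0] * d_star + mi[1]) / (2.0 * em[0] * d_star + em[1])  # Eq. 3.2
    return d_star, k

def best_intensity(k, em, mi, d_min, d_max, steps=7000):
    """Grid search for the intensity that maximises MI(d) - k * EM(d)."""
    grid = [d_min + (d_max - d_min) * s / steps for s in range(steps + 1)]
    return max(grid, key=lambda d: quad(mi, d) - k * quad(em, d))

# Hypothetical coefficients (a, b, c), not taken from the paper.
EM = (0.02, -1.0, 20.0)    # emissions per unit, rising for d >= 25
MI = (-0.01, 1.6, 0.0)     # marginal income per unit, maximal at d = 80
d_star, k4 = key_parameter(4, 10, EM, MI, d_min=30.0, d_max=100.0)
d_grid = best_intensity(k4, EM, MI, d_min=30.0, d_max=100.0)
```

With these numbers the grid optimum coincides with d_star up to the grid resolution, which is what equation 3.2 is constructed to achieve. For intensities beyond the maximum of MI the computed k turns negative, so the ratio no longer describes a meaningful trade-off there.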



Fig. 3 shows an evaluation function for production quantities of a production unit
for the parameters kg=0.1 and k4=0.5 (n=10)¹.




[Figure: two panels, each plotting the evaluation value against the production
quantity [U]]

Fig. 3. Evaluation Values for Different Key Parameters



4. Exemplary Negotiation Process 

To illustrate the negotiation process, a small example consisting of two production
units (PU1 and PU2) is considered. Fig. 4a-b shows the evaluation functions of the
two production units (with identical key parameter k), including the current state
of production (underlined in Fig. 4a-b). Assuming that the production state of PU1
is below a given satisfaction level a of e.g. 85% (of the difference between maximal
and minimal evaluation value), a negotiation process is initiated by this production
unit: a job-request is sent from PU1 to PU2. Next, the evaluation function of PU2 is
calculated as a function of the potential production quantities of PU1 (Fig. 4c). To
determine the new allocation of workload between PU1 and PU2, the sum of the
evaluation functions of Fig. 4c is calculated (Fig. 4d). The result of the
negotiation process (the production quantity of PU1) is determined by the maximal
value of the aggregation function in Fig. 4d. The production quantity of PU2 follows
accordingly.

Fig. 4a-4b. Evaluation Values and Satisfaction Level
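The two-unit negotiation described above can be sketched as follows. The evaluation functions, the total workload and the candidate quantities are hypothetical stand-ins for the curves of Fig. 4; only the logic (a satisfaction check against the 85% level, then maximising the sum of both evaluation functions over the quantity assigned to PU1) follows the text.

```python
def satisfaction(evaluate, quantity, candidates):
    """Position of the current evaluation value between the minimal and
    maximal evaluation value over all candidate quantities (0 = worst, 1 = best)."""
    values = [evaluate(x) for x in candidates]
    lowest, highest = min(values), max(values)
    return (evaluate(quantity) - lowest) / (highest - lowest)

def negotiate(eval_pu1, eval_pu2, total, candidates):
    """Quantity for PU1 that maximises the aggregated evaluation.
    PU2 produces the remainder of the total workload."""
    feasible = [x for x in candidates if 0 <= x <= total]
    return max(feasible, key=lambda x: eval_pu1(x) + eval_pu2(total - x))

# Hypothetical evaluation functions (assumptions, not the curves of Fig. 4).
def eval_pu1(x):
    return -(x - 60) ** 2

def eval_pu2(x):
    return -(x - 50) ** 2

TOTAL = 100                      # total workload to be split between PU1 and PU2
CANDIDATES = range(0, 101)       # admissible production quantities
LEVEL = 0.85                     # satisfaction level from the text

current_pu1 = 30
if satisfaction(eval_pu1, current_pu1, CANDIDATES) < LEVEL:
    new_pu1 = negotiate(eval_pu1, eval_pu2, TOTAL, CANDIDATES)   # job-request
    new_pu2 = TOTAL - new_pu1
```

In this toy setting PU1 starts at a satisfaction of 0.75, so it issues a job-request, and the aggregated evaluation is maximal when PU1 produces 55 units and PU2 the remaining 45.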



¹ Within the dashed part, the emissions per unit according to equation 3.2 cannot be
guaranteed.









5. Conclusion 

The described approach shows in principle that the allocation of workload in a 
production network can be based on an agent-oriented simulation system. To de- 
scribe the single agents the CommonKADS knowledge engineering model seems 
to be well suited. In order to model the multi-criteria goal functions of the agents a 
goal programming approach can be applied. In addition the different behaviour be- 
tween the poles, proactive and reactive, can be modelled. Further research work 
will focus on different cooperation relations in supply chains as well as on the 
evaluation of different strategies for the simultaneous addressing of economical 
and ecological goals. Furthermore the investigation of the dynamic behaviour of a 
whole production network seems to be interesting. In this context the potential 
learning capabilities of the single agents seem to be crucial. 



REFERENCES 

[1] v. Bechtoltsheim, M.: Agentensysteme, Vieweg Verlag, 1993

[2] Müller, H.J.: Towards Agent Systems Engineering, Data & Knowledge Engineering,
Vol. 23, pp. 217-245, 1997

[3] Müller, H.J.: Negotiation Principles, in: O'Hare, G.M.P. and Jennings, N.R. (Eds.):
Foundations of Distributed Artificial Intelligence, pp. 211-230, Wiley Interscience,
1996

[4] Gutenberg, E.: Grundlagen der Betriebswirtschaftslehre, 1. Band, Die Produktion,
16. Auflage, pp. 314-325, Springer-Verlag, 1969

[5] Hansmann, K.W.: Umweltorientierte Betriebswirtschaftslehre, pp. 10-12, Gabler, 1998





Decision Support for the National Implementation
of Emission Reduction Measures by the Dynamic
Mass Flow Optimisation Model ARGUS



Jutta Geldermann, Nurten Avci, Stefan Wenzel, Otto Rentz 

French-German Institute for Environmental Research (DFIU / IFAREIIP) 
University of Karlsruhe (TH), Hertzstr. 16, D-76187 Karlsruhe, Germany 
E-mail: {first name.last name}@wiwi.uni-karlsruhe.de; 
http://www-iip.wiwi.uni-karlsruhe.de/-'Voc/



Abstract. A large number of industrial sectors is affected by enforced environ- 
mental requirements, and therefore a cost-optimal transformation is essential for 
the economic and environmental efficiency both of companies and regions. 
Techno-economic dynamic optimisation models - like ARGUS - have been de- 
veloped and implemented for various European Countries. The resulting cost 
functions for various scenarios are essential for the multi-national allocation of 
emission reductions. Cost-discounting effects and the temporal pathway of the 
implementation of emission reduction options within a given planning horizon (up 
to 2020) are considered. The results can be used to analyse the cost-effectiveness 
of environmental legislation and for emission projections and inventories. 



1 Introduction 

The availability of reliable cost functions for the considered pollutants (SO2,
NOx, VOC, NH3) and countries (UN/ECE) is an important prerequisite for the
implementation of the critical load/level approach based on integrated assessment
modelling [1; 10]. Besides critical loads/levels and parameters describing the
atmospheric transport and transformation of pollutants, cost functions have an
important influence on the national emission ceilings, which have been set recently
by the European NEC-Directive (81/2001/EC). The EU Member States are thus obliged to
draw up programmes to ensure compliance with these limits by 2010 at the latest, in
order to adopt the emission limits negotiated in the Gothenburg Protocol of the
UN/ECE to abate acidification, eutrophication and tropospheric ozone (also called
the "multi-pollutant and multi-effect Protocol"). Other approaches to reach the
environmental targets set by the Gothenburg Protocol are the EU-Solvent Directive
(13/99/EC) and the IPPC-Directive (taking into account all environmental effects for
integrated pollution prevention and control; 61/96/EEC). The 'best available
techniques' (BAT) play an essential role in the actual realisation of these
regulations.

A large number of industrial sectors are affected by these enforced environmental
requirements, and a cost-optimal transformation is therefore essential for the
economic and environmental efficiency of both companies and regions. Therefore, cost
functions have to be as accurate as necessary. This means in particular that all
relevant sources and all available emission reduction options have to be taken into
account, including technical pollution abatement measures but also structural
options (like autonomous technology change during the planning period), which often
show an important cost saving potential, especially in countries with economies in
transition.

Dynamic energy and mass flow optimisation models have been developed to assess
emission reduction strategies on the national level. They are based on a detailed
representation of all relevant emission processes and the corresponding emission
reduction options, including structural options. They provide the "cost optimal"
evolution of the production system (production technologies and abatement options in
place) over a given planning horizon (up to 2020), which allows for the achievement
of emission reduction ceilings and the fulfilment of the demand for products or
services specified exogenously on the sectoral level. This contribution explains the
underlying principle of the dynamic energy and mass flow model ARGUS (Allocation
module for a computer aided generation of environmental strategies for emissions),
which has been developed for the elaboration of emission reduction strategies for
VOC from stationary sources on a national level [7; 6; 11].



2 Methodological Background of the Dynamic Mass 
Flow Optimisation Model ARGUS 

The (quasi-) dynamic mass-flow optimisation model ARGUS is based on a de- 
tailed representation of all relevant stationary VOC emission sources and the cor- 
responding applicable emission reduction options. The optimisation criterion is the 
minimisation of the sum of the discounted costs over the considered planning ho- 
rizon. Emission sources and abatement options are described in terms of reference 
installations defined by the UN/ECE Task Forces on abatement options/techniques 
for VOC [4; 9]. About 2000 emission relevant processes within about 40 industrial 
sectors (like vehicle coating or production of paints and adhesives) are modelled. 
The input data are structured as follows: 

■ Technological data sheets specify the emission factor $e_{s,i,j}$, the investment
and the operating costs of an emission reduction option (s,i,j) applicable to a
reference installation i of sector s.

■ In contrast, the country data sheets are used to characterise the time-dependent
and country-specific structure of emission sources in terms of sectoral activities,
the shares of the different reference installations in the sectoral activities
(market shares), and the implementation rate $y_{s,i,j,t}$ of the emission reduction
options (the share of option (s,i,j) in the sectoral activity of reference
installation (s,i) in year t).

This separation of the data allows for wide applicability in different countries,
since the technological data can be used as default values and adjusted to current
needs, with most of the necessary modifications concerning the country-specific
data.

The model determines the evolution of the structure of emission sources and
abatement options for the planning horizon from 1995 (base year) to 2015. The
optimisation variables are the implementation shares $y_{s,i,j,t}$ of the different
emission reduction options. The evolution of sectoral activities and market shares
of reference installations are specified exogenously and are taken from the country
data sheets. The target function is given by

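[Eq. (1): target function, summed over the sectors s, years t, reference
installations i and reduction options j; the formula is not legible in the source]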

The first term within the brackets denotes the investment-related expenditures due
to an increase $\Delta Q_{s,i,j,t}$ of the capacity of emission reduction option
(s,i,j) in year t, valued at the annualised specific investment per unit of capacity
of option (s,i,j). The second term represents the operating costs of option (s,i,j),
obtained by multiplying the annual activity $P_{s,i,j,t}$ of installations equipped
with this option by the specific annual costs per unit of activity $OC_{s,i,j}$. All
these parameters can be expressed as functions of the optimisation variables and the
parameters specified in the technological and country data sheets [7; 6; 2].

Thus, the decision (optimisation) variables are the capacities of the implemented
production or emission reduction technologies for each year. The optimisation is
performed subject to restrictions concerning, inter alia, the fulfilment of the
demand for products or services and the limitation of emissions. The total VOC
emissions to be considered in the constraint on an emission ceiling for each
period t are given by

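[Eq. (2): total VOC emissions in period t, summed over the sectors s, reference
installations i and reduction options j; the formula is not legible in the source]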

Eq. (2) takes into account the efficiency of reduction option (s,i,j). In practice,
as the time delay for the implementation of the Protocol obligations is relatively
long, measures requiring important transformations of existing installations will
only be implemented for new installations. The implementation potential of such
options is controlled by the changes in sectoral activities (e.g. an increase of
production capacity) and the capacities of installations reaching the end of their
lifetime in a given time period (renewal of installations). The remaining capacities
of existing installations are calculated from the age distribution and lifetime of
installations using a lifetime model included in ARGUS.

A cost function for the target year is calculated in the following way. In a first
step, the target function is minimised without any emission ceiling; this yields the
unconstrained emission value for the target year. In a second step, optimisation
runs are performed for a set of decreasing emission ceilings (constraints according
to Eq. 2), ranging from this unconstrained value down to the minimum feasible value.
Moreover, intermediate targets can be defined in scenarios in order to analyse the
influence of the implementation delay of emission reduction obligations on the cost
functions. Figure 1 shows, as an example, the cost functions for Germany [7]
corresponding to the different implementation scenarios. Similar results have been
obtained for France [6; 11].
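The two-step procedure for deriving a cost function can be illustrated with a deliberately simplified sketch. It is not the ARGUS model: instead of a dynamic optimisation over a planning horizon, it assumes a single year, a baseline emission level and a hypothetical list of abatement options, each with a constant unit cost and a limited abatement potential. Under these assumptions the cost-minimal way to meet a ceiling is to use the cheapest options first, and sweeping the ceiling from the unconstrained emission value down to the minimum feasible value traces out a cost function.

```python
def min_cost_for_ceiling(baseline, options, ceiling):
    """Minimal abatement cost to bring emissions from `baseline` down to `ceiling`.

    options: list of (unit_cost, potential) pairs, i.e. cost per unit of emission
    avoided and maximal abatement. Returns None if the ceiling is infeasible.
    """
    required = max(0.0, baseline - ceiling)
    cost = 0.0
    for unit_cost, potential in sorted(options):    # cheapest options first
        used = min(required, potential)
        cost += used * unit_cost
        required -= used
    if required > 1e-9:                             # not enough abatement potential
        return None
    return cost

def cost_function(baseline, options, steps):
    """(ceiling, cost) pairs from the unconstrained emission value down to
    the minimum feasible value."""
    floor = baseline - sum(potential for _, potential in options)
    ceilings = [baseline - k * (baseline - floor) / steps for k in range(steps + 1)]
    return [(c, min_cost_for_ceiling(baseline, options, c)) for c in ceilings]

# Hypothetical data: (unit cost, abatement potential), not taken from the paper.
OPTIONS = [(2.0, 10.0), (5.0, 20.0), (1.0, 5.0)]
curve = cost_function(baseline=100.0, options=OPTIONS, steps=7)
```

The resulting curve is piecewise linear and becomes steeper as the ceiling tightens, because progressively more expensive options have to be used.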









Figure 1: Cost functions for VOC emissions from stationary sources in Germany for
the different implementation scenarios (planning horizon = 2015, target year = 2010,
interest rate i = 0%)



3 Uncertainties and Variability in Dynamic Mass Flow
Optimisation Models

Estimates and long-term forecasts obviously are, by their very nature, uncertain: A 
strategy considered optimal on the basis of particular assumptions made today is 
highly unlikely to turn out optimal in the actual situation of 2010 [13]. In this con- 
text, the following types of uncertainty and variability can be distinguished [12]: 

■ Parameter uncertainty, such as a lack of knowledge about emission estimates 
per unit or environmental degradation rates; 

■ Model uncertainty, like the assumption of linear relationships between the 
mass and energy flows, or the assumption that substances are physically and 
chemically homogeneous; 

■ Uncertainty due to choices, such as the choice between several allocation 
methods, if delimitation problems occur with regard to the assignment of the 
costs for emission abatement and those incurred by other purposes (e.g. indus- 
trial health and safety standards); or the choice of weighting factors, if several 
harmful substances have to be considered simultaneously; 

■ Spatial variability, like regional differences in emission estimates, or varying 
sensitivity of the local environment; 

■ Temporal variability, e.g. differences in yearly emissions or variability 
caused by seasonal changes; 

■ Variability between sources, e.g. different techno-economic characteristics of 
the relevant production plants. 

So far, most applied Operations Research approaches concentrate on the treat- 
ment of parameter uncertainty, starting from basics like the request for data for- 
mats allowing for the characterisation of the data quality. Imprecise data might be 








modelled by means of interval arithmetic or fuzzy logic [3]. More advanced
approaches cover the treatment of subjective uncertainty estimates, e.g. by Bayesian
statistics, while stochastic modelling describes parameters as uncertainty
distributions and includes Monte Carlo or Latin Hypercube simulation [14]. Given the
size of the dynamic mass flow optimisation models, however, it seems questionable
whether a comprehensive analysis of the model's sensitivity can be achieved. Here,
reducing the model size might open paths for further investigation.
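As a small illustration of how imprecise data can be carried through a calculation with interval arithmetic, the following sketch propagates interval-valued activities and emission factors to an interval for the total emissions. The `Interval` class, the two hypothetical sources and all numbers are assumptions made for this example; they are not part of ARGUS.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] representing an imprecisely known quantity."""
    lo: float
    hi: float

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("lower bound exceeds upper bound")

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # The product range is spanned by the products of the end points.
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

def source_emissions(activity, emission_factor):
    """Emissions of one source: activity times emission factor."""
    return activity * emission_factor

# Two hypothetical emission sources with uncertain activity and emission factor.
coating = source_emissions(Interval(100.0, 120.0), Interval(0.5, 0.6))
printing = source_emissions(Interval(40.0, 50.0), Interval(1.0, 1.5))
total = coating + printing
```

The width of `total` shows directly how strongly the input imprecision affects the result. Unlike a Monte Carlo simulation, this gives guaranteed bounds but no information about how likely values inside the interval are.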



4 Conclusions 

Techno-economic dynamic energy and mass flow optimisation models for the
determination of national cost functions take into account full sets of emission
reduction options, including structural options related to changes in sectoral
activities and production technologies. While ARGUS has been developed for the
analysis of VOC emission reduction strategies for a large number of stationary
sources, PERSEUS, for example, has been widely used to optimise energy systems
including the aspects of resources and emissions (SO2, NOx, and CO2). Further
research is now necessary on the integration of these models for the analysis of
combined emission reduction measures. For this cross-media assessment, only the
analysis of a discrete set of scenarios seems to be feasible up to now, applying
MADM methods (cf. [3]). Whether the discussion of Life Cycle Assessment (LCA) or
sustainability indicators can provide useful input for dynamic mass flow
optimisation models is currently an open issue.

A main factor limiting the accuracy of the required data, however, is the avail- 
ability and quality of input data, in particular with regard to the structure of emis- 
sion sources (statistics on activities, information on plants and applied processes, 
e.g. size and age distribution of installations, control options already in place). Es- 
pecially the modelling of smaller regions with few installations of similar type re- 
quires modified approaches [8]. Further research in the field of Operations Re- 
search seems to be necessary for more advanced sensitivity analyses within 
complex techno-economic dynamic energy and mass flow optimisation models. 



5 Literature 

[1] Alcamo, J.; Shaw, R.; Hordijk, L. (eds.): The RAINS Model of Acidification.
Science and Strategies in Europe. Kluwer, Dordrecht, 1990

[2] Brooke, A.; Kendrick, D.; Meeraus, A.: GAMS, The Scientific Press, 1988

[3] Geldermann, J.; Rentz, O.: Entwicklung eines multikriteriellen
Entscheidungsunterstützungssystems zur integrierten Technikbewertung. In:
Operations Research Proceedings 2000, pp. 445-451, Springer, Berlin (2001)

[4] Nunge, S.; Geldermann, J.; Rentz, O.: Der Einsatz der Clusteranalyse zur
Definition von Referenzanlagen. In: Operations Research Proceedings 2001, SOR 2001,
Duisburg, Springer-Verlag, Berlin (2002)

[5] Rentz, O.; Nunge, S.; Karl, U.; Holtmann, T.; Zundel, T.: Feasibility Study on
the Development of a Design for an Emission Projection Model Based on the
CORINAIR-Approach, final report, December 1999

[6] Rentz, O.; Avci, N.; et al.: Élaboration de fonctions de coût pour la réduction
des émissions de composés organiques volatils pour la France, final report, research
project supported by ADEME, Karlsruhe, 1999

[7] Rentz, O.; Laforsch, M.; Holtmann, T.; Nunge, S.; Zundel, T.; Avci, N.:
Technologies, measures and costs for the avoidance of VOC emissions as a basis for
integrated assessment models in the framework of the UN/ECE, final report on behalf
of UBA, Berlin, 1998

[8] Rentz, O.; Wenzel, S.; Avci, N.; Geldermann, J.; Joas, R.; Schott, R.: Proposal
for a Scheme of Action for the Reduction of the Ozone-Precursors NOx and VOC in
Austria until 2010. On behalf of the Ministries of the Environment and of Economics,
Vienna, 2002

[9] Rentz, O.; Nunge, S.; Laforsch, M.; Holtmann, T.: BAT Background Document of the
Task Force on the Assessment of Abatement Options/techniques for Volatile Organic
Compounds, September 1999

[10] Wietschel, M.; Dreher, M.; Fichtner, W.; Goebelt, M.; Rentz, O.: PERSEUS: State
of Model Development, Applications and Perspectives, in: Hake, J.-Fr.; Markewitz, P.
(eds.): Modellinstrumente für CO2-Minderungsstrategien, Jülich, 1997, pp. 223-240

[11] Zundel, T.; Fichtner, W.; Avci, N.; Frank, M.; Rentz, O.: Dynamic energy and
mass flow optimisation models for the elaboration of cost functions for VOC and NOx
emission reduction: methodological issues, in: Pollution Atmosphérique, Numéro
spécial "Atelier d'Angers", 2000

[12] Huijbregts, M.A.J.: Uncertainty and variability in environmental life-cycle
assessment, Amsterdam, 2001

[13] Landrieu, G.; Mudgal, S.: Commentaries on the results of studies requested from
IIASA by the French Ministry for Land-Use Planning and the Environment. Task Force
on Integrated Assessment Modelling, 25th session, Stockholm, 2000

[14] Insua, D.R.; Martin, J.; Proll, L.; French, S.; Salhi, A.: Sensitivity Analysis
in Statistical Decision Theory: A Decision Analytic View. In: Journal of Statistical
Computation and Simulation, 1997, Vol. 57, 1-4, pp. 197-218





Fuzzy Scheduling for the Dismantling of Complex 
Products 



Frank Schultmann and Otto Rentz 

University of Karlsruhe, Institute for Industrial Production, Hertzstrasse 16,
D-76187 Karlsruhe

E-mail: {frank.schultmann; otto.rentz}@wiwi.uni-karlsruhe.de 



Abstract. In this paper, an approach is presented for the solution of decision- 
making problems in short-term dismantling planning. Weak data resulting from 
uncertain dismantling time parameters are modeled as fuzzy sets and integrated 
into a scheduling model that supports deriving optimal work plans. Case studies 
from the construction industry and the electronic scrap recycling show the applica- 
tion of the approach. 



1 Introduction 

Due to extended producer responsibility and stricter environmental regulations, an
increasing amount of spent products, e.g. electronic equipment, vehicles or
buildings, will have to be dismantled in the near future. Consequently, the
management of dismantling and the subsequent recycling cannot focus only on the
genuine aim of profitability, but will also have to meet criteria of sustainability,
e.g. limiting the discharge of pollutants to the environment. To plan an integrated
dismantling and recycling procedure, numerous different environmental, technical and
economic objectives have to be taken into account. Thus, alternative scenarios for
the end-of-life treatment need to be considered on the strategic, tactical and
operational level. While objectives in short-term planning are usually modeled using
extended time-based objective functions, different alternatives to meet certain
targets in the field of sustainability can be modeled using multiple modes.
Mathematically, both aspects can be realized by using project-scheduling models.
However, even when using multiple modes, one major problem in modeling dismantling
processes is the fact that the data needed to put mathematical models to work is not
always available. In particular, the duration of dismantling tasks is seldom
precisely known due to uncertainties in the composition of returned materials. The
use of fuzzy sets proves to be a powerful tool for modeling this weak data. Fuzzy
techniques are well established in mathematical theory; however, fuzzy programming
is still hardly used for dismantling tasks.



2 A fuzzy-scheduling model for dismantling 

The well-known multi-mode resource-constrained project scheduling problem 
(MRCPSP) is used as a basis for the following model formulations. For a survey 
on modeling concepts as well as scheduling algorithms for resource-constrained 
project scheduling see [1], for modeling dismantling problems we refer to [2]. 



Operations Research Proceedings 
© Springer- Verlag Heidelberg 2003 










2.1 Planning of dismantling structures 

Dismantling planning aims first at setting up a technology-oriented order of the
dismantling activities to be carried out. The technological precedence relations of
the dismantling process can be illustrated by a topologically ordered
activity-on-node network, where the nodes represent the dismantling activities j
(j = 1, …, J) and the arcs indicate the precedence relations between these
activities. With regard to the model formulated later, the network contains one
unique source (j = 1) and one unique sink (j = J). This can always be guaranteed by
introducing a dummy source and a dummy sink, respectively. For examples, we refer to
[2]. Usually each activity can be processed in different ways, e.g. using different
techniques, which can be expressed by different resources. Moreover, each technique
may result in a different processing time. Several alternatives for carrying out a
job can be modeled by introducing different modes m (m = 1, …).

The scheduling model contains resource categories. Activity j in mode m is
associated with the usage of renewable resources and the consumption of nonrenewable
resources. Renewable resources (e.g. machines, workers) are constrained only on a
per-period basis (possibly varying from period to period), i.e. after an activity j
is accomplished, the renewable resources used by j are available to process another
activity. Nonrenewable resources (e.g. financial budget), in contrast, are limited
on the basis of the project's entire duration. Consequently, the consumption of a
nonrenewable resource by activity j reduces its availability for the rest of the
project. For simplification purposes we will use renewable resources only. Variables
for resource usage as well as for constraints are introduced as follows:

$q_{jmr}$: capacity of renewable resource r used by dismantling activity j being
performed in mode m for each period the activity is in process, and
$Q_{rt}$: capacity of renewable resource r, $r \in R$, available in period t.
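The per-period constraint on renewable resources can be made concrete with a short feasibility check. The data layout (dictionaries keyed by activity, mode, resource and period), the function names and the example numbers are assumptions chosen for this sketch; the check itself only restates the requirement that, in every period, the usage summed over all activities in process must not exceed the available capacity.

```python
from collections import defaultdict

def renewable_usage(schedule, duration, usage):
    """Per-period usage of each renewable resource.

    schedule: {activity j: (start period, mode m)}
    duration: {(j, m): number of periods the activity is in process}
    usage:    {(j, m): {resource r: capacity used per period}}
    Returns {(r, t): total usage of resource r in period t}.
    """
    load = defaultdict(float)
    for j, (start, m) in schedule.items():
        for t in range(start, start + duration[(j, m)]):
            for r, q in usage[(j, m)].items():
                load[(r, t)] += q
    return load

def is_resource_feasible(schedule, duration, usage, capacity):
    """capacity: {(r, t): available capacity}; missing entries count as zero."""
    load = renewable_usage(schedule, duration, usage)
    return all(q <= capacity.get(key, 0.0) for key, q in load.items())

# Hypothetical instance: two activities, activity 2 with two alternative modes.
DURATION = {(1, 1): 2, (2, 1): 3, (2, 2): 2}
USAGE = {(1, 1): {"crew": 2}, (2, 1): {"crew": 1}, (2, 2): {"crew": 3}}
CAPACITY = {("crew", t): 3 for t in range(6)}

slow_mode = {1: (0, 1), 2: (0, 1)}   # activity 2 in mode 1, in parallel to activity 1
fast_mode = {1: (0, 1), 2: (1, 2)}   # activity 2 in mode 2, overlapping in period 1
```

Here `slow_mode` respects the crew capacity in every period, while `fast_mode` needs five crew members in period 1 and is therefore infeasible; this is the kind of trade-off between modes that the scheduling model has to resolve.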

2.2 Modeling of uncertain time parameters 

Performing activity j in mode m has a non-preemptable processing time, also referred
to as duration, of $d_{jm}$ periods. Due to the uncertainties of these processing
times, resulting e.g. from different conditions of the discarded products, time
parameters are modeled as fuzzy sets.

The expectations of a decision-maker concerning fuzzy time parameters like the
processing time or the start and finish times of an activity j in the mode m
(m = 1, …) can be modeled as a fuzzy set. A crisp set A in a base set X can be
described by a membership function $\mu_A: X \to \{0,1\}$ with $\mu_A(x) = 1$ if
$x \in A$ and $\mu_A(x) = 0$ if $x \notin A$. If it is uncertain whether element x
belongs to set A, the model has to be extended in a way that $\mu_A$ maps into the
interval [0,1], with its value representing a membership possibility between poor
(low value) and likely (high value). This leads to the definition of a fuzzy set [3]:

\[ \tilde{A} = \{(x, \mu_{\tilde{A}}(x)) \mid x \in X\}, \quad \text{with the membership function } \mu_{\tilde{A}}: X \to [0,1] \qquad (1) \]








Other definitions needed are the fuzzy number, which is a normalized convex
fuzzy subset of $\mathbb{R}$ with exactly one $x_0 \in \mathbb{R}$ such that
$\mu_{\tilde{A}}(x_0) = 1$, and the fuzzy interval, which, contrary to a fuzzy
number, can comprise more than one $x \in \mathbb{R}$ such that
$\mu_{\tilde{A}}(x) = 1$. Since the precise form of a fuzzy number can rarely be
described by an expert in reality, a practicable way of getting suitable
membership functions of fuzzy data is for the expert to express optimistic and
pessimistic information about parameter uncertainty on some prominent
membership levels $\alpha$ for each mode m in which activity j could be
processed. The $\alpha$-level set of the fuzzy interval (duration)
$\tilde{D}_{jm}$ is defined as:

$D_{jm}^{\alpha} = \{x \in \mathbb{R} : \mu_{\tilde{D}_{jm}}(x) \ge \alpha\}$, with $\alpha \in [0,1]$   (2)

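In this case, the optimistic (pessimistic) value for the duration of a
dismantling activity j performed in mode m on level $\alpha$ represents the
lower (upper) bound of the corresponding $\alpha$-level set of the fuzzy
interval $\tilde{D}_{jm}$. If L levels
$\alpha = \sigma_1, \ldots, \sigma_L \in\, ]0,1[$, $L \in \mathbb{N}$, are
introduced, the duration can be expressed as:

$D_{jm}^{\alpha} = [\,{}_{o}d_{jm}^{1};\ {}_{p}d_{jm}^{1}\,]$   if $\alpha = 1$,   (3)

$D_{jm}^{\alpha} = [\,{}_{o}d_{jm}^{1} - {}_{o}\delta_{jm}^{\alpha};\ {}_{p}d_{jm}^{1} + {}_{p}\delta_{jm}^{\alpha}\,]$   if $\alpha = \sigma_1, \ldots, \sigma_L \in\, ]0,1[$, $L \in \mathbb{N}$.   (4)

The variables ${}_{o}\delta_{jm}^{\alpha}$ (${}_{p}\delta_{jm}^{\alpha}$) denote
the expert's assessment for the acceleration (prolongation) of the optimistic
duration ${}_{o}d_{jm}^{1}$ (pessimistic duration ${}_{p}d_{jm}^{1}$) for
activity j in mode m on level $\alpha$, with ${}_{o}d_{jm}^{1}$ and
${}_{p}d_{jm}^{1}$ fixed on level $\alpha = 1$. Apart from this particular case
($\alpha = 1$), two additional levels ($\alpha = \varepsilon = 0.1$;
$\alpha = \lambda = 0.6$) are considered in the following, according to [3],
with:

$\mu_{\tilde{D}_{jm}}(d_{jm}) = 1$ indicates that value $d_{jm}$ certainly
belongs to the set of possible values,

$\mu_{\tilde{D}_{jm}}(d_{jm}) \ge \lambda$ indicates that value $d_{jm}$ has a
good chance of belonging to the set of possible values,

$\mu_{\tilde{D}_{jm}}(d_{jm}) < \varepsilon$ indicates that the expert
estimates that value $d_{jm}$ has only a very poor chance of belonging to the
set of possible values.

With ${}_{o}d_{jm}^{\alpha} = {}_{o}d_{jm}^{1} - {}_{o}\delta_{jm}^{\alpha}$ and
${}_{p}d_{jm}^{\alpha} = {}_{p}d_{jm}^{1} + {}_{p}\delta_{jm}^{\alpha}$
($\alpha \in \{1, \lambda, \varepsilon\}$), $\mu_{\tilde{D}_{jm}}$ can be
approximated by combining linear functions section-wise (cf. Fig. 1).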









Fig. 1. Membership function of the duration $\tilde{D}_{jm}$



A fuzzy interval that can be depicted by a membership function as shown in
Figure 1 can be transformed into a six-point convention such that it is
represented by six real numbers (cf. [4]):

$\tilde{D}_{jm} = ({}_{o}d_{jm}^{\varepsilon},\ {}_{o}d_{jm}^{\lambda},\ {}_{o}d_{jm}^{1},\ {}_{p}d_{jm}^{1},\ {}_{p}d_{jm}^{\lambda},\ {}_{p}d_{jm}^{\varepsilon})$,   j = 1, ..., J,  m = 1, ..., M_j   (5)



The fuzzy arithmetics that can be applied for such six-point representations are 
shown in [5]. 
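The six-point convention can be illustrated with a minimal Python sketch. It assumes the levels ε = 0.1 and λ = 0.6 from the text, linear interpolation between neighbouring points, and a membership degree of 0 outside the outermost points; the function names and the example durations are hypothetical and not taken from the paper.

```python
EPS, LAM = 0.1, 0.6  # membership levels epsilon and lambda used in the text


def membership(x, six_point, eps=EPS, lam=LAM):
    """Piecewise-linear membership degree of x for a six-point fuzzy interval.

    six_point = (d_o_eps, d_o_lam, d_o_1, d_p_1, d_p_lam, d_p_eps), non-decreasing.
    Assumption of this sketch: the degree is 0 outside [d_o_eps, d_p_eps].
    """
    xs = list(six_point)
    ys = [eps, lam, 1.0, 1.0, lam, eps]
    if x < xs[0] or x > xs[-1]:
        return 0.0
    # Find the segment that contains x and interpolate linearly on it.
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if x0 <= x <= x1:
            if x1 == x0:  # degenerate segment (two points coincide)
                return max(y0, y1)
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return 0.0


def alpha_cut(six_point, alpha, eps=EPS, lam=LAM):
    """Return (lower, upper) of the alpha-level set for alpha in {eps, lam, 1}."""
    index = {eps: 0, lam: 1, 1.0: 2}[alpha]
    return six_point[index], six_point[5 - index]


# Hypothetical duration of one activity in one mode (in periods).
duration = (4, 5, 6, 8, 10, 12)
print(membership(5.5, duration))  # degree between lambda and 1
print(alpha_cut(duration, LAM))   # optimistic and pessimistic bound on level lambda
```

Reading the optimistic and pessimistic bound per level off the tuple is what allows one fuzzy problem to be split into crisp problems, one for each combination of bound and level.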



3 Model formulation 

With reference to the resource-constrained project scheduling problem introduced
by Pritsker et al. [6], scheduling for dismantling purposes can be formulated as
a binary linear program with the decision variables $x_{jmt}$ (dismantling
activity j is performed in mode m and completed in period t). As proposed by
Hapke et al. [7], this problem can be used for fuzzy scheduling as well. In the
following, we assume that the earliest and latest finishing times, the earliest
and latest starting times as well as the durations of the dismantling
activities are given as fuzzy intervals in six-point representations. Using the
expert's estimates for optimistic (o) and pessimistic (p) durations on the
three levels ($\alpha = 1, \lambda, \varepsilon$), the fuzzy model can be
transformed into six crisp resource-constrained project scheduling problems.

In order to reduce the number of variables in the programming formulation, fuzzy
time windows can be calculated using a modified critical path analysis. Critical
path analysis requires an upper bound T for the makespan of the project. It
should also be noted that the unique source (j = 1) and the unique sink (j = J)
have zero duration and zero resource usage and consumption. The

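earliest and latest starting and finishing times ${}_{k}ES_{j}^{\alpha}$,
${}_{k}EF_{j}^{\alpha}$, ${}_{k}LS_{j}^{\alpha}$, ${}_{k}LF_{j}^{\alpha}$,
$k \in \{o,p\}$, $\alpha \in \{1, \lambda, \varepsilon\}$, for the dismantling
activities j = 1, ..., J can be calculated as:

${}_{k}ES_{1}^{\alpha} = 0$,   $k \in \{o,p\}$, $\alpha \in \{1,\lambda,\varepsilon\}$,   (6)

${}_{k}EF_{1}^{\alpha} = {}_{k}d_{11}^{\alpha}$,   $k \in \{o,p\}$, $\alpha \in \{1,\lambda,\varepsilon\}$,   (7)

${}_{k}ES_{j}^{\alpha} = \max\{{}_{k}ES_{i}^{\alpha} + {}_{k}d_{i1}^{\alpha} \mid i \in P_j\}$,   j = 2, ..., J, $k \in \{o,p\}$, $\alpha \in \{1,\lambda,\varepsilon\}$,   (8)

${}_{k}EF_{j}^{\alpha} = {}_{k}ES_{j}^{\alpha} + {}_{k}d_{j1}^{\alpha}$,   j = 2, ..., J, $k \in \{o,p\}$, $\alpha \in \{1,\lambda,\varepsilon\}$,   (9)

${}_{k}T^{\alpha} = \sum_{j=1}^{J} \max_{m=1,\ldots,M_j} {}_{k}d_{jm}^{\alpha}$,   $k \in \{o,p\}$, $\alpha \in \{1,\lambda,\varepsilon\}$,   (10)

${}_{k}LF_{J}^{\alpha} = {}_{k}T^{\alpha}$,   $k \in \{o,p\}$, $\alpha \in \{1,\lambda,\varepsilon\}$,   (11)

${}_{k}LF_{i}^{\alpha} = \min\{{}_{k}LF_{j}^{\alpha} - {}_{k}d_{j1}^{\alpha} \mid j \in S_i\}$,   i = 1, ..., J-1, $k \in \{o,p\}$, $\alpha \in \{1,\lambda,\varepsilon\}$,   (12)

${}_{k}LS_{i}^{\alpha} = {}_{k}LF_{i}^{\alpha} - {}_{k}d_{i1}^{\alpha}$,   i = 1, ..., J, $k \in \{o,p\}$, $\alpha \in \{1,\lambda,\varepsilon\}$,   (13)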

where the modes of the activities are labeled with respect to non-decreasing 
duration for the levels a = l, A,e and for the optimistic and pessimistic expert’s 

assessments. Based on these time windows calculated according to (6)... (13), six 
crisp scheduling problems can be formulated as follows: 



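Minimize

$\sum_{m=1}^{M_J} \sum_{t={}_{k}EF_{J}^{\alpha}}^{{}_{k}LF_{J}^{\alpha}} t \cdot x_{Jmt}$,   $k \in \{o,p\}$, $\alpha \in \{1,\lambda,\varepsilon\}$   (14)

Subject to

$\sum_{m=1}^{M_j} \sum_{t={}_{k}EF_{j}^{\alpha}}^{{}_{k}LF_{j}^{\alpha}} x_{jmt} = 1$,   j = 1, ..., J, $k \in \{o,p\}$, $\alpha \in \{1,\lambda,\varepsilon\}$   (15)

$\sum_{m=1}^{M_i} \sum_{t={}_{k}EF_{i}^{\alpha}}^{{}_{k}LF_{i}^{\alpha}} t \cdot x_{imt} \le \sum_{m=1}^{M_j} \sum_{t={}_{k}EF_{j}^{\alpha}}^{{}_{k}LF_{j}^{\alpha}} (t - {}_{k}d_{jm}^{\alpha}) \cdot x_{jmt}$,   j = 2, ..., J, $i \in P_j$, $k \in \{o,p\}$, $\alpha \in \{1,\lambda,\varepsilon\}$   (16)

$\sum_{j=1}^{J} \sum_{m=1}^{M_j} q_{jmr} \sum_{\tau=t}^{t + {}_{k}d_{jm}^{\alpha} - 1} x_{jm\tau} \le Q_{rt}$,   $r \in R$, t = 1, ..., T, $k \in \{o,p\}$, $\alpha \in \{1,\lambda,\varepsilon\}$   (17)

$x_{jmt} \in \{0,1\}$,   j = 1, ..., J, m = 1, ..., M_j, $t = {}_{k}EF_{j}^{\alpha}, \ldots, {}_{k}LF_{j}^{\alpha}$, $k \in \{o,p\}$, $\alpha \in \{1,\lambda,\varepsilon\}$   (18)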

The problem (14)...(18) can be solved by using exact or heuristic approaches.
Here, the model is solved by a branch and bound algorithm, the basic ideas of
which were originally introduced by Talbot [8], [9]. Patterson et al. [10]
proposed an enumeration procedure of the branch and bound type, and Sprecher
[11] accelerated and generalized the algorithm. A slightly modified version of
this algorithm has been proposed by Schultmann [12]. The algorithm generates
exact solutions as far as regular measures of performance (cf. [11]) are
considered. An example of a regular measure of performance is the objective
function (14).
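The time windows that bound the index t of the decision variables come from a forward and a backward pass over the project network. The following minimal sketch shows such a pass for one crisp problem, i.e. one combination of k and α. It assumes that activities are numbered 1..J in topological order, that `durations` holds the shortest-mode duration of each activity, and that the horizon is supplied by the caller; all names and data are hypothetical.

```python
def time_windows(durations, predecessors, horizon):
    """Forward/backward pass of a critical path analysis for one crisp problem.

    durations[j]    shortest-mode duration of activity j (activities 1..J,
                    numbered in topological order: an assumption of this sketch)
    predecessors[j] list of immediate predecessors of activity j
    horizon         upper bound T on the makespan
    """
    J = len(durations)
    ES, EF = {}, {}
    for j in range(1, J + 1):
        # Earliest start: all predecessors must have finished.
        ES[j] = max((EF[i] for i in predecessors.get(j, [])), default=0)
        EF[j] = ES[j] + durations[j]

    successors = {i: [] for i in range(1, J + 1)}
    for j, preds in predecessors.items():
        for i in preds:
            successors[i].append(j)

    LF, LS = {}, {}
    for i in range(J, 0, -1):
        # Latest finish: no successor may be pushed beyond its latest start.
        LF[i] = min((LS[j] for j in successors[i]), default=horizon)
        LS[i] = LF[i] - durations[i]
    return ES, EF, LS, LF


# Hypothetical project: dummy source 1, dummy sink 5.
durations = {1: 0, 2: 3, 3: 2, 4: 4, 5: 0}
predecessors = {2: [1], 3: [1], 4: [2], 5: [3, 4]}
ES, EF, LS, LF = time_windows(durations, predecessors, horizon=9)
print(EF[5], LF[5])  # earliest and latest finish of the sink
```

Running the function once per combination of k in {o, p} and α in {1, λ, ε} would give the six sets of time windows used by the six crisp problems.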

As a result of the solution procedure, schedules are generated which represent
optimistic and pessimistic time assessments on the given $\alpha$-levels. Thus,
the result is a minimal project duration for each of the six crisp problems
(14)...(18), which together represent the fuzzy interval of the minimal project
duration (Tab. 1).








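Table 1. Solutions for the crisp scheduling problems (14)...(18)

Problem     1                            2                        3                  4                  5                        6
k           o                            o                        o                  p                  p                        p
$\alpha$    $\varepsilon$                $\lambda$                1                  1                  $\lambda$                $\varepsilon$
makespan    ${}_{o}\Phi^{*\varepsilon}$  ${}_{o}\Phi^{*\lambda}$  ${}_{o}\Phi^{*1}$  ${}_{p}\Phi^{*1}$  ${}_{p}\Phi^{*\lambda}$  ${}_{p}\Phi^{*\varepsilon}$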



4 Conclusion 

The model presented has already been successfully applied to the dismantling of
buildings and discarded electronic products. Nevertheless, by concentrating on
processing times modeled as fuzzy intervals, the model covers only a selection
of the uncertainties decision makers are confronted with in practice.
Accordingly, further uncertainties like fuzzy due dates, fuzzy capacity
constraints, the uncertain composition of discarded products or fuzzy
precedence relations have to be considered and incorporated in the model.
Despite all the dimensions still to be considered, the most important obstacle
to fuzzy dismantling planning in practice is the generation of adequate data.

References 

[1] Brucker, P., Drexl, A., Möhring, R., Neumann, K., Pesch, E. (1999)
Resource-constrained project scheduling: Notation, classification, models, and
methods. European Journal of Operational Research 112 (1), 3-41

[2] Schultmann, F., Rentz, O. (2001) Environment-oriented project scheduling for
the dismantling of buildings. OR Spektrum 23 (1), 51-78

[3] Rommelfanger, H. (1994) Fuzzy Decision Support Systeme. Entscheiden bei
Unschärfe, 2. Auflage. Springer, Berlin, Heidelberg, New York (in German)

[4] Hapke, M., Slowinski, R. (1996) Fuzzy priority heuristics for project
scheduling. Fuzzy Sets and Systems 83, 291-299

[5] Hapke, M., Slowinski, R. (2000) Fuzzy Set Approach to Multi-Objective and
Multi-Mode Project Scheduling under Uncertainty. Scheduling under Fuzziness.
Slowinski, R., Hapke, M. (eds.). Springer, Berlin, Heidelberg, 197-221

[6] Pritsker, A.A.B., Watters, L.J., Wolfe, P.M. (1969) Multiproject scheduling
with limited resources: A zero-one programming approach. Management Science 16,
93-108

[7] Hapke, M., Jaszkiewicz, A., Slowinski, R. (1994) Fuzzy project scheduling
system for software development. Fuzzy Sets and Systems 21, 101-117

[8] Talbot, F. B., Patterson, J. H. (1978) An efficient integer programming
algorithm with network cuts for solving resource-constrained scheduling
problems. Management Science 24, 1163-1174

[9] Talbot, F. B. (1982) Resource-constrained project scheduling with
time-resource tradeoffs: The nonpreemptive case. Management Science 28,
1197-1210

[10] Patterson, J. H., Slowinski, R., Talbot, F. B., Weglarz, J. (1989) An
algorithm for a general class of precedence and resource constrained scheduling
problems. Advances in Project Scheduling. Slowinski, R., Weglarz, J. (eds.).
Elsevier, Amsterdam, 3-28

[11] Sprecher, A. (1994) Resource-Constrained Project Scheduling - Exact Methods
for the Multi-Mode Case. Springer, Berlin, Heidelberg, New York

[12] Schultmann, F. (1998) Kreislaufführung von Baustoffen - Stoffflussbasiertes
Projektmanagement für die operative Demontage- und Recyclingplanung von
Gebäuden. Erich Schmidt, Berlin (in German)




Group Decision Making Versus Expert Opinion in
the Multi-Objective Analysis of Ecosystem
Management¹



Lidija Zadnik Stirn

University of Ljubljana, Biotechnical Faculty, Večna pot 83, 1000 Ljubljana,
Slovenia, e-mail: lidija.zadnik@uni-lj.si



1 INTRODUCTION 

Ecological issues, with demands for preserving nature and for the multipurpose
use of land, have, along with existing economic criteria, become a key part of
the modern concept of ecosystem management. Therefore, on the one hand, land
owners and experts are faced with land use decisions which maximize profit and
refer to ecological objectives, while on the other hand the public, who benefit
from the amenity value of the ecosystem, derive their own scenario of decisions.
As such, an ecosystem management problem is the satisfactory attainment of
multiple, but conflicting, objectives (Zadnik 2001, 2002).

This problem has prompted a decision support model to determine multi-objective
ecosystem management decisions that achieve the goals of both the
owners/experts and the public. In the model, the management process is defined
in terms of decisions, constraints and objectives. The state of the treated
ecosystem is described in terms of parameters $s_1, s_2, \ldots, s_S$, such as
area of arable land, forest land, products, labor force, machinery, financial
resources, ecological conditions, etc. The parameters $s_1, s_2, \ldots, s_S$
form a state vector $x(j) = x(s_1, \ldots, s_S) \in X$, where X is the set of
all possible state vectors of the treated ecosystem at the considered time. We
assume that there is a finite number of such vectors x(j) at the treated time.
Furthermore, considerable attention is paid to the determination of the goal
state of the ecosystem $x(j^*) = x^*(s_1^*, s_2^*, \ldots, s_S^*)$, which is
designed to meet demands for sustainable, pro-natural, environment-friendly and
socio-economically sound decisions.

¹ The present paper is part of an EU-financed research project on 'Tools for
evaluating investment in the Mediterranean mountain areas - An integrated
framework for sustainable development - MEDMONT (QLK5-CT-2000-01031)'.

The decision maker can influence the virtual ecosystem by invoking management
decisions, expressed by decision variables $d(m,x(j)) \in D(x(j))$,
m = 1, 2, ..., M. For each state vector x(j) there exists a finite discrete set
of decisions D(x(j)). The decisions are mutually exclusive at the treated time
and its accompanying state. The set of feasible decisions is constrained by
area limitations, budget constraints, available machinery and labor hours,
environmental constraints, etc. The decision variables d(m,x(j)) move the
ecosystem from a state x(j) to another state x'(j') at the next time moment
(Eq. 1.1), which should be closer to the goal state x(j*).

$x'(j') = h(x(j), d(m,x(j)))$   (1.1)

The transformation function h is defined empirically for each ecosystem. 

The objective function $Z(x(j), d(m,x(j)))$ with components
$z(k, x(j), d(m,x(j)))$, k = 1, 2, ..., K, such as net revenue from production,
recreational opportunities, sustainability, etc., which reflect managerial
objectives and link state and decision variables, is to be optimized:

$\max \text{ and/or } \min_{d(m,x(j))} \sum_{k=1}^{K} z(k, x(j), d(m,x(j)))$   (1.2)

subject to Eq. (1.1), for m = 1, 2, ..., M.

The objective functions are first evaluated from a single decision maker 
(owner/expert) point of view. For each decision, the objectives are given through 
the multidimensional objective function, viewed as a separable and additive de- 
composition of the linear utility function which is expressed by indicators 
(Winston 1994). As the objectives are of different importance to decision makers, 
weights of objectives are generated (Robinson et al. 1991). Further, the public par- 
ticipation in decision making process is expressed through a group decision mak- 
ing under multiple objectives. An ordinal agreed criteria approach (Hwang and 
Lin 1987) and conjoint analysis (Hair et al. 1998) are used. 

Finally, knowing the effects of ecosystem management decisions on both, own- 
ers/experts and public, the analyst incorporates their opinion into the decision 
model to determine the best compromise decision (Zeleny 1984; Ballestero and 
Romero 1998). A numerical example is presented to illustrate the problem, model 
and methods. 



2 THE WEIGHTS OF OBJECTIVES AND THE UTILITY 
FUNCTION 

Because the objectives $z(k, x(j), d(m,x(j)))$ are multiple, conflicting and of
different importance (value) to the decision makers, the weights $w_k$ for each
objective have to be determined. The method used to determine the decision
makers' preferences among the objectives is Saaty's AHP method (Golden et al.
1989; Winston 1994). For the pairwise comparison of one objective with another,
a scale with values 1 to 5 is used. The relative comparisons $r_i/r_j$ of
objective i over objective j are gathered in a matrix W. The weights $w_k$ are
obtained by an iterative procedure (Winston 1994) of calculating the sequence
of matrices $W^2, (W^2)^2, \ldots$, which is followed by summing the rows.
Finally, the columns are normalized. Let us illustrate the calculation of
weights with an example of 3 decisions and 3 objectives: investment (I),
sustainability (S) and profit (P). Supposing that experts expressed the
relative importance of one objective over another by the matrix W, the most
important is sustainability ($w_2 = 0.56$), the second investment
($w_1 = 0.32$) and the least important is profit ($w_3 = 0.12$).

            I     S     P
      I  [  1     3     4  ]                               w_1 = 0.32
W  =  S  [ 1/3    1     3  ]    W^2, (W^2)^2, ...   -->    w_2 = 0.56
      P  [ 1/4   1/3    1  ]                               w_3 = 0.12
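The iterative procedure described above (repeated squaring of the comparison matrix, summing the rows, normalizing) can be sketched in a few lines of Python. The sketch is an illustration under stated assumptions rather than the author's implementation: the comparison matrix below is a hypothetical, perfectly consistent one, chosen so that the exact weights 4/7, 2/7 and 1/7 are known in advance, and the rescaling step is added only to keep the numbers bounded.

```python
def ahp_weights(W, tol=1e-9, max_iter=50):
    """Approximate AHP priority weights by repeated squaring of W.

    After each squaring the row sums are normalized to sum to one; the
    iteration stops when the weights no longer change noticeably.
    """
    n = len(W)

    def matmul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    def row_weights(M):
        sums = [sum(row) for row in M]
        total = sum(sums)
        return [s / total for s in sums]

    current = [row[:] for row in W]
    weights = row_weights(current)
    for _ in range(max_iter):
        current = matmul(current, current)
        # Rescale to keep entries bounded; this does not change the weights.
        scale = max(max(row) for row in current)
        current = [[v / scale for v in row] for row in current]
        new_weights = row_weights(current)
        if max(abs(a - b) for a, b in zip(new_weights, weights)) < tol:
            return new_weights
        weights = new_weights
    return weights


# Hypothetical, perfectly consistent comparison matrix (not the paper's W).
W_example = [[1, 2, 4],
             [1 / 2, 1, 2],
             [1 / 4, 1 / 2, 1]]
weights = ahp_weights(W_example)
print([round(w, 3) for w in weights])
```

For an inconsistent matrix the squaring step matters: the normalized row sums then converge towards the principal eigenvector, which is the usual AHP priority vector.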



Furthermore, it is assumed that the objective functions $z(k, x(j), d(m,x(j)))$
for k = 1, 2, ..., K are determined by attributes $a_k(x(j), d(m,x(j)))$, each
having a finite number of possible values called levels. Let
$a_{k,i}(x(j), d(m,x(j)))$ be the i-th level (i = 1, 2, ..., I) of the k-th
attribute (k = 1, 2, ..., K) associated with the state x(j) and decision
d(m,x(j)). The vector objective function $Z(x(j), d(m,x(j)))$ is expressed as
the decision maker's multiattribute utility (value) function:

$v(a_{1,i}, \ldots, a_{k,i}, \ldots, a_{K,i}) = v(a_{1,i}(x(j), d(m,x(j))), \ldots, a_{k,i}(x(j), d(m,x(j))), \ldots, a_{K,i}(x(j), d(m,x(j))))$.

Under certain conditions, however, the assessment of
$v(a_{1,i}, \ldots, a_{k,i}, \ldots, a_{K,i})$ can be greatly simplified
(Winston 1994). This is the case if the operator of multicriteria control is
additively separable with respect to the space of inputs and the control space.
The utility function $v(a_{1,i}, \ldots, a_{k,i}, \ldots, a_{K,i})$ is an
additive function if there exist K functions
$v_1(a_{1,i}), \ldots, v_k(a_{k,i}), \ldots, v_K(a_{K,i})$ satisfying Eq. (2.1):

$v(a_{1,i}, \ldots, a_{k,i}, \ldots, a_{K,i}) = \sum_{k=1}^{K} w_k v_k(a_{k,i})$   (2.1)

where $w_k$ represent the weights of the attributes $a_k$, in our case
determined through the matrix W. If the attribute $a_k$ is a numerical variable
(known with certainty), the decision maker is risk neutral, and the utility
function for a single attribute is linear, the utility function $v(a_{k,i})$ is
given by Eq. (2.2):

$v(a_{k,i}) = (a_{k,i} - a_{k,worst}) / (a_{k,best} - a_{k,worst})$   (2.2)

where $v(a_{k,i} = a_{k,worst}) = 0$ and $v(a_{k,i} = a_{k,best}) = 1$, as
illustrated in the second and the third line of Table 1, which considers three
decisions $d_1, d_2, d_3$ and three objectives (sustainability measured by
attribute $a_1$, investment measured by attribute $a_2$ and profit measured by
attribute $a_3$) and shows the utility functions $v(a_{k,i})$ calculated by the
use of Eq. (2.2). If we calculate $v(a_{k,i})$ for each decision and use the
normalization for the purpose of commensurability, we get the normalized values
in Table 1. Taking into account the weights $w_k$ calculated for the 3
objectives, i.e. $w_1 = 0.32$, $w_2 = 0.56$ and $w_3 = 0.12$, and using Eq.
(2.1), we obtain for the utility functions of decisions $d_1, d_2, d_3$ the
following values: $d_1$: 0.042; $d_2$: 0.720; $d_3$: 0.544. The normalized
values are: $d_1$: 0.03; $d_2$: 0.55; $d_3$: 0.42.

The results show that, with regard to the experts'/owners' preferences and the
chosen utility functions, the most preferred is decision 2, the second is
decision 3 and the least preferred is decision 1.
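The two formulas used in this section, the linear single-attribute utility of Eq. (2.2) and the additive aggregation of Eq. (2.1), can be written down directly. In the sketch below the function names are hypothetical; the only figures taken from the text are the three aggregated utilities 0.042, 0.720 and 0.544, which are normalized so that they sum to one.

```python
def linear_utility(value, worst, best):
    """Eq. (2.2): 0 at the worst level, 1 at the best level, linear in between.

    Works for cost attributes as well (worst > best).
    """
    return (value - worst) / (best - worst)


def additive_utility(weights, utilities):
    """Eq. (2.1): weighted sum of the single-attribute utilities."""
    return sum(w * v for w, v in zip(weights, utilities))


def normalize(values):
    """Scale non-negative values so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]


# Aggregated utilities of d1, d2, d3 as reported in the text.
reported = [0.042, 0.720, 0.544]
shares = [round(v, 2) for v in normalize(reported)]
print(shares)  # [0.03, 0.55, 0.42]
```

The rounded shares agree with the normalized values quoted in the text, so the ranking d2, d3, d1 follows directly from the aggregated utilities.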



Table 1. The utility function for the attributes for 3 decisions and 3 objectives

Att.   d1    d2    d3    v(a_k,i)   normalized          normalized          normalized
                                    v(a_k,i) for d1     v(a_k,i) for d2     v(a_k,i) for d3
a1     a     b     c     ...        0                   0.33                0.67
a2     ...   5     2     ...        0                   0.72                0.28
a3     ...   ...   ...   ...        0.26                0                   0.74



















3 AGREED ORDINAL APPROACH AND CONJOINT 
ANALYSIS 

Let us assume that in Eq. (1.2), subject to Eq. (1.1), we deal with m decisions
$d_i$ (i = 1, ..., m) which should be evaluated by n decision makers $A_k$
(k = 1, 2, ..., n), i.e. members of the public, residents, or groups of
residents, who use p attributes $a_j$ (j = 1, 2, ..., p) for the evaluation.
For each decision maker who evaluates the decisions $d_i$ by ranking, rating or
scoring, a matrix $A_k$ is generated. In what follows we consider only ranking,
i.e. the ordinal approach is used (Hwang and Lin 1987). Furthermore, the agreed
criteria approach, in which each decision maker uses the same criteria, is
applied to find the matrices $A_k$. Starting from the matrices $A_k$, we
calculate for each criterion a matrix $B_j$ (j = 1, ..., p). Its elements are
the ranks of decisions $d_i$ given by the decision makers $A_k$. By summing up
the elements in each row of the matrices $B_j$, a matrix A' is generated. A'
provides the ranks for all decisions $d_i$ under all criteria $a_j$ given by
the decision makers $A_k$. However, some criteria may be more important than
others, and decision makers would then want to place more weight on those
criteria. To accomplish this, we use a vector of weights
$W = (w_1, \ldots, w_p)$ obtained by the AHP method. Then we formulate an
agreement matrix G whose entries $g_{ir}$ are given by Eq. (3.1):

$g_{ir} = \sum_{j=1}^{p} a_{irj} w_j$   (3.1)

where $a_{irj} = 1$ if the decision $d_i$ is placed in the r-th position for
the criterion (attribute) $a_j$; otherwise $a_{irj} = 0$.

Furthermore, we want to match each decision $d_i$ with a rank r so that the sum
of the corresponding assigned weight values is the largest possible. This task
can be achieved by solving the assignment problem of linear programming given
by Eq. (3.2):

$\max \sum_{i=1}^{m} \sum_{r=1}^{m} g_{ir} x_{ir}$   (3.2)

subject to $\sum_{i=1}^{m} x_{ir} = 1$, r = 1, ..., m, and
$\sum_{r=1}^{m} x_{ir} = 1$, i = 1, ..., m, where $x_{ir} = 1$ if rank r is
assigned to $d_i$; otherwise it is 0.
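Eqs. (3.1) and (3.2) can be illustrated with a small Python sketch. It assumes that the aggregated rank position of every decision under every criterion is already known, uses hypothetical ranks and weights, and solves the assignment problem by enumerating all permutations, which is adequate only for a handful of decisions; a linear programming or Hungarian-method solver would be the usual choice for larger instances.

```python
from itertools import permutations


def agreement_matrix(ranks, weights):
    """Eq. (3.1): G[i][r] sums the weights of the criteria that place
    decision i in position r + 1.

    ranks[i][j] is the position (1..m) of decision i under criterion j.
    """
    m = len(ranks)
    G = [[0.0] * m for _ in range(m)]
    for i, row in enumerate(ranks):
        for j, position in enumerate(row):
            G[i][position - 1] += weights[j]
    return G


def best_assignment(G):
    """Eq. (3.2) by brute force: give each decision a distinct rank so that
    the summed agreement is maximal. Returns the rank of each decision."""
    m = len(G)
    best = max(permutations(range(m)),
               key=lambda perm: sum(G[i][perm[i]] for i in range(m)))
    return [r + 1 for r in best]


# Hypothetical data: 3 decisions, 3 criteria, criterion weights summing to one.
ranks = [[1, 1, 2],   # d1 is first under a1 and a2, second under a3
         [2, 3, 1],   # d2
         [3, 2, 3]]   # d3
criterion_weights = [0.5, 0.3, 0.2]
G = agreement_matrix(ranks, criterion_weights)
final_ranks = best_assignment(G)
print(final_ranks)  # rank assigned to d1, d2, d3
```

In this example each decision receives the rank on which most of the weighted criteria agree.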

For the example with 3 decisions ($d_1, d_2, d_3$), three objectives
($a_1, a_2, a_3$) and five decision makers $A_1, \ldots, A_5$, whose preferences
are given by the matrices $A_1, \ldots, A_5$, the following results are
calculated: $d_1$ is the most preferred, $d_2$ the second preferred and $d_3$
the least preferred decision, which ranks in the last place. For the compromise
solution we need the normalized values given by the decision makers. Taking
into account the normalized values, $d_1$ is the most preferred. Let us assign
3 points to it, 2 points to $d_2$ and 1 point to $d_3$. Normalizing the points,
we obtain the following normalized values for the decisions: $d_1$: 0.50;
$d_2$: 0.34; $d_3$: 0.16.

            a1 a2 a3               a1 a2 a3               a1 a2 a3
      d1  [ 3  3  1 ]        d1  [ 3  1  2 ]        d1  [ 3  3  3 ]
A_1 = d2  [ 2  2  3 ]  A_2 = d2  [ 1  3  1 ]  A_3 = d2  [ 1  1  1 ]
      d3  [ 1  1  2 ]        d3  [ 2  2  3 ]        d3  [ 2  2  2 ]

            a1 a2 a3               a1 a2 a3
      d1  [ 2  1  1 ]        d1  [ 3  3  3 ]
A_4 = d2  [ 1  3  3 ]  A_5 = d2  [ 2  2  1 ]
      d3  [ 3  2  2 ]        d3  [ 1  1  2 ]








Furthermore, to assess the public preferences among the decisions, the
attributes and their levels have to be chosen. We assume three attributes
(factors), each at two levels, low (L) and high (H): an ecological factor, a
factor of investment and a factor of profit. Using the full profile method and
assuming that an additive rule is appropriate, we were able to use a factorial
design which enabled the evaluation of all 8 stimuli: LLL (low ecology, low
investment and low profit), LLH, LHL, LHH, HLL, HHL, HHH, HLH. The conjoint
analysis experiment was administered during personal interviews with 24
respondents, who ranked the stimuli with 1 as the best and 8 as the worst. The
survey results are as follows: respondent 1: 6 5 3 2 4 1 8 7; ...; respondent
24: 6 3 4 2 5 1 8 7.

Using the SPSS subprogram utility (Norusis 1997), the estimates of the
part-worths (utility (s.e.)) and the relative importance (importance in %) of
each attribute are calculated for each respondent separately and then
aggregated to obtain the total worth (TW), using the additive rule (Backhaus et
al. 1994) in Eq. (3.3):

TW = const. + part-worth of 1st factor + part-worth of 2nd factor + part-worth of 3rd factor   (3.3)

The overall utilities (preferences) for all 24 respondents are:


STIMULI   1(LLL)   2(LLH)   3(LHL)   4(LHH)   5(HLL)   6(HHL)   7(HHH)   8(HLH)
UTILITY   5.7917   5.8125   5.7917   5.8125   5.7917   3.1875   3.1875   3.1875



We are interested only in the stimuli HLL, LHL and LLH, which represent the
decisions $d_1$, $d_2$ and $d_3$, respectively. The results reveal that the
respondents' preference is the highest for $d_3$, while $d_1$ and $d_2$ are of
equal importance to them.



4 DETERMINATION OF AN IDEAL COMPROMISED 
DECISION BY A COMPROMISE PROGRAMMING METHOD 



Compromise programming is explained with Eq. (4.1) (Steuer 1986, Tecle et al.
1994, Taha 1997, Ballestero and Romero 1998):

$L_{si} = \left( \sum_{j} w_j^{s} \left| \frac{x_{jbest} - x_{ji}}{x_{jbest} - x_{jworst}} \right|^{s} \right)^{1/s}$   (4.1)

where $L_{si}$ is a measure of the distance of decision $d_i$ from the ideal
decision, $w_j$ is the weight for objective $a_j$, $x_{ji}$ is the attribute
level for objective $a_j$ and decision $d_i$, $x_{jbest}$ is the best value for
objective $a_j$, i.e. the maximum for benefit criteria and the minimum for cost
criteria, and $x_{jworst}$ is, correspondingly, the worst value.
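A minimal sketch of Eq. (4.1) follows. It assumes s = 1, equal weights of 1, and normalized deviations per criterion as they are read here from Table 2 of this paper; that reading, like the function names, is an assumption of the sketch.

```python
def normalized_deviation(x, best, worst):
    """Relative deviation of level x from the best level (0 = best, 1 = worst)."""
    return (best - x) / (best - worst)


def compromise_distance(deviations, weights, s=1):
    """Eq. (4.1): weighted L_s distance from the ideal decision."""
    return sum((w * abs(d)) ** s for w, d in zip(weights, deviations)) ** (1 / s)


# Normalized deviations for criteria a1, a2, a3 (assumed reading of Table 2).
deviations = {
    "d1": [0.00, 0.80, 0.50],
    "d2": [0.32, 0.00, 0.50],
    "d3": [0.67, 0.20, 0.00],
}
equal_weights = [1, 1, 1]
distances = {name: compromise_distance(devs, equal_weights, s=1)
             for name, devs in deviations.items()}
print({name: round(value, 2) for name, value in distances.items()})
```

With these inputs the distances for d1 and d2 come out at about 1.30 and 0.82. Larger values of s put more emphasis on the largest single deviation, so the choice of s can change the ordering.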



Let us assume in our numerical case 3 decisions $d_1, d_2, d_3$, evaluated by 3
criteria: $a_1$ by the public through agreed criteria, $a_2$ by experts through
the utility function, and $a_3$ by the public, owners and experts through
conjoint analysis. If we further ascribe to each criterion $a_j$, depending on
decision $d_i$, a linear function $f(a_{ji})$ in the sense of the function
given by Eq. (2.2), and use Eq. (4.1), we get Table 2:








Table 2. The evaluation of 3 decisions by experts, owners and public

Criterion        d1     d2     d3     normalized         normalized         normalized
                                      v(a_j) for d1      v(a_j) for d2      v(a_j) for d3
Agreed (a1)      0.50   0.34   0.16   0                  0.32               0.67
Utility (a2)     0.03   0.55   0.42   0.80               0                  0.20
Conjoint (a3)    ...    ...    0      0.50               0.50               0

Taking into account the equal weights $w_1 = 1$, $w_2 = 1$ and $w_3 = 1$, and
minimizing the expression in Eq. (4.1), we obtain the results: $d_1$: 1.3,
$d_2$: 0.82, $d_3$: 0.88. The results show that the most preferred is $d_1$,
the second is $d_3$ and the least preferred is $d_2$.



REFERENCES 

Backhaus K, Erichson B, Plinke W, Weiber R (1994) Multivariate Analysemethoden.
Springer Verlag, Berlin

Ballestero E, Romero C (1998) Multiple criteria decision making and its
applications to economic problems. Kluwer, Boston

Golden B, Wasil E, Harker P (1989) Analytic hierarchy process. Springer,
Heidelberg

Hair JF, Anderson RE, Tatham RL, Black WC (1998) Multivariate data analysis.
MacMillan Pub. Co., New York

Hwang CL, Lin MJ (1987) Group decision making under multiple criteria.
Springer, Berlin

Norusis MJ (1997) SPSS for Windows: Advanced Statistics, Ver. 10. SPSS Inc.,
Chicago

Robinson JP, Shaver PR, Wrightsman LS (1991) Measures of personality and social
psychological attitudes. Academic Press, San Diego

Steuer RE (1986) Multiple criteria optimization. John Wiley, New York

Taha HA (1997) Operations research. Prentice Hall, New Delhi

Tecle A, Duckstein L, Korhonen P (1994) Interactive, multiobjective
programming. Applied mathematics and computation 63: 75-93

Winston WL (1994) Operations research. Duxbury Press, Belmont, CA

Zadnik Stirn L (2001) Compromise programming for solving economic and
environmental problems. In: Oblak L (ed) Ways for improving woodworking
industry for transitional economies. Biotechnical Faculty, Ljubljana, pp 157-162

Zadnik Stirn L (2002) Evaluation of projects regarding the preferences of
several decision makers (in Slovene). In: Grad J (ed) Proceedings DSI-SOR,
Ljubljana, pp 366-371

Zeleny M (1982) Multicriteria decision making. McGraw-Hill, New York



















Capital Market Efficiency - An Empirical Analysis 
of the Dividend Announcement Effect for the 
Austrian Stock Market 



Roland Mestel¹, Henryk Gurgul², Christoph Schleicher³

¹ Department of Banking and Finance, University of Graz, Austria;
roland.mestel@uni-graz.at

² Department of Applied Mathematics, University of Mining and Metallurgy,
Krakow, Poland; pgurgul@go2.pl

³ Department of Economics and Institute of Applied Mathematics, University of
British Columbia, Vancouver, Canada; chrschle@interchange.ubc.ca



Summary: This study investigates the effects of dividend announcements using 
data from the Austrian stock market. Abnormal returns are established as the dif- 
ference between actual returns and predicted returns generated from time series 
models. We use the model of expected dividends, where expectations are based on 
prior dividends. Our results provide evidence that announced dividend changes 
convey new information to the market as stock prices move in the same direction 
as dividends. In addition, the speed of stock price reaction to the new information 
provides a test of the semistrong form of the efficient capital market hypothesis. 



1 Introduction 

It is widely accepted in financial theory that, due to market imperfections, a
company's dividend policy has an impact on the wealth of its shareholders.¹ At
the core of the arguments lies the fact that dividend changes convey valuable
information from the better-informed management to the market. Stock price
reactions to announced dividend changes therefore reflect information
asymmetries between the firm and its investors.

Nevertheless, it is still a matter of debate what exactly the information
content of changing dividends is. Many authors argue that the firm's
decision-makers, via dividends, convey their inside information about current
and/or future cash flows to the investors (cash flow signaling hypothesis).
Following this argument, dividend increases (decreases) serve as signals of
increased (decreased) current and/or prospective future earnings.

An alternative body of research rests upon the free cash flow hypothesis. Free
cash flow is the section of cash flow that is left after all investments having
a positive net present value (when cash flows are discounted at the appropriate
rate) are realized. If these free cash flows are not distributed as dividends,
managers tend to invest them below the cost of capital, or waste them on
inefficient organizational expenditure. From this, Lang and Litzenberger (1989)
conclude that an increase in dividends is primarily a signal that managers are
paying out the free cash flows to the shareholders rather than investing them
in projects with negative net present value, thereby reducing agency costs.

¹ The authors would like to thank Christian Gutlederer (Reuters Austria) and
Peter Ladreiter (Capital Bank) for supplying data and Peter Steiner (University
of Graz) and the anonymous referee for helpful comments.

Both theories predict a positive correlation between announced dividend 
changes and stock prices. This effect has been confirmed empirically by several 
studies, mainly for the US market (e.g. Aharony and Swary (1980) or Best and 
Best (2001); for the German stock market the study by Amihud and Murgia 
(1997) has been the most rigorous). For the Austrian stock market our study is the 
first to test the dividend announcement effect on stock prices. 

As dividend announcements have an impact on shareholders’ wealth the man- 
ner in which this effect is incorporated in stock prices through time also provides a 
test of the semistrong form of the efficient capital market hypothesis. This hy- 
pothesis states that stock prices, at any time, reflect all publicly available informa- 
tion relevant to the valuation of the firm. Therefore, if an announced change in 
dividends provides new information to the market (i.e. we assume no insider trad- 
ing), the speed of price reaction supplies information on the degree of market effi- 
ciency. The faster the prices adjust to the news the more efficient the capital mar- 
ket. 

The remainder of the paper is organized as follows: Section 2 describes our 
data; in Section 3 we analyze our methodology to determine abnormal (excess) re- 
turns; in Section 4 we present our results; Section 5 contains a discussion and con- 
clusion. 



2 Data 

Our sample contains 22 companies listed on the Austrian stock market. The com- 
panies have been quoted on the Austrian Traded Index (ATX) between January 
1992 and April 2002, although not all firms have been listed on the stock market 
for the whole period. 

For these firms all daily closing prices are derived from Datastream or the
Vienna Stock Exchange. For the period under consideration we filtered 176
dividend announcements from several thousand included in the Dow Jones and
Reuters Factiva database. We assume all announced dividend changes to be
unexpected. This can be justified by the reluctance-to-change-dividends
assertion, which states that managers do not change dividend payments unless
they have reasons to expect a significant change in the future prospects of the
firm (Aharony and Swary 1980). Of the observed announcements, 74 were
classified as announced dividend increases, 75 as constant dividends and 27 as
dividend reductions.

We define the announcement (event) date as the very first day on which official
statements on dividends by the executive board of the analyzed firm can be
identified in the Factiva database. In many cases these announcements only
contain information on the expected direction of dividend changes (increase;
constant; decrease), but not on their exact levels. Neither the ex-dividend day
nor the day the dividend is paid is considered to be an announcement day.



3 Methodology 

To establish the impact of dividend announcements on stock prices for Austrian 
firms, we conduct an event study analysis based on selected time-series models. 
This is different from most related studies that use simple linear regressions (based 
on the market model by William Sharpe) to forecast stock returns. As the assump- 
tions of the simple linear regression model are often violated this leads to biased 
estimators of parameters and biased forecasts of stock returns. 

From daily closing prices we first calculate log-returns

$$R_{i,t} = \ln\left(\frac{P_{i,t}}{P_{i,t-1}}\right), \qquad (3.1)$$

where $P_{i,t}$ stands for the stock price of company i on date t, $P_{i,t-1}$ denotes the stock
price of company i on date t-1, and ln denotes the natural logarithm.

For each event (dividend announcement) in the sample we define an event- 
window and a pre-event window. The event window comprises 5 trading days, the 
announcement date (t = 0) plus the two days before (t = -2, t = -1) and the two
days after the announcement (t = +1, t = +2). The pre-event window covers the 50
trading days prior to the event window. For each day of the event window we
compute the abnormal return AR as the difference between the actual ex-post return
and the security's normal return that is predicted in the absence of the event.
Formally, for each announcement of the analyzed companies we computed

$$AR_{i,t} = R_{i,t} - E[R_{i,t} \mid X], \qquad (3.2)$$

where $R_{i,t}$ stands for the actual return of firm i on date t in the event window and
$E[R_{i,t} \mid X]$ denotes the predicted return conditional on the information set X, where
$X = (R_{i,-52}, \ldots, R_{i,-3})$.

To predict returns in the event window we used Box and Jenkins' ARIMA
methodology. To illustrate the model identification we consider the general
ARIMA(p,d,q) model for a time series $z_t$

$$\phi_p(B)(1-B)^d z_t = \theta_0 + \theta_q(B) a_t, \qquad (3.3)$$

where $\phi_p(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$, $\theta_q(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$,
$B z_t = z_{t-1}$, B is called the backshift operator, and $a_t$ is white noise (error term).
Model identification refers to the methodology for identifying the required sample
size (n = 50 is recommended), the proper orders p, d and q, variance-stabilizing
transformations, and the decision to include a deterministic parameter $\theta_0$ when
$d \ge 1$.

We first examined the plots of the individual log-returns in the pre-event win- 
dows in order to get a feel for the overall properties (trends, outliers, constancy of 
variance, stationarity). Our conjecture was that the time series are either integrated 
of order zero or one. 





Next, we identify the orders p and q by matching the patterns in the sample
ACF (autocorrelation function) and PACF (partial autocorrelation function) with
the theoretical patterns of known models summarized in Table 1.



Table 1. Characteristics of theoretical ACF and PACF for stationary time series

Process      ACF                               PACF
AR(p)        Tails off as exponential decay    Cuts off after lag p
             or damped sine wave
MA(q)        Cuts off after lag q              Tails off as exponential decay
                                               or damped sine wave
ARMA(p,q)    Tails off after lag (q-p)         Tails off after lag (p-q)



When ACF and PACF are not significant and in the absence of a deterministic 
trend we model the time series as white noise (i.e. ARIMA(0,0,0)). 
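
As a rough illustration of this identification step, the following sketch computes a sample ACF for a 50-observation pre-event window and lists the lags that fall outside the approximate 95% white-noise band of ±1.96/√n. The function names, the simulated series and the band rule are illustrative assumptions and not the authors' implementation.

```python
import numpy as np

def sample_acf(series, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of a 1-D series."""
    z = np.asarray(series, dtype=float)
    z = z - z.mean()
    denom = np.dot(z, z)
    return np.array([np.dot(z[: z.size - k], z[k:]) / denom
                     for k in range(max_lag + 1)])

def significant_lags(series, max_lag):
    """Lags k >= 1 whose |r_k| exceeds the approximate 95% band 1.96/sqrt(n)."""
    acf = sample_acf(series, max_lag)
    bound = 1.96 / np.sqrt(len(series))
    return [k for k in range(1, max_lag + 1) if abs(acf[k]) > bound]

# Hypothetical pre-event window: 50 simulated log-returns (not the paper's data).
rng = np.random.default_rng(0)
window = rng.normal(0.0, 0.01, size=50)
print(significant_lags(window, max_lag=10))
```

If no lag is flagged and no trend is visible, the series would be treated as white noise in the sense described above.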

In the random walk model, i.e. ARIMA(0,1,0), the value of z at time t is equal
to the value at time (t-1) augmented by a random shock. In empirical forecasting
it happens very often that the hypothesis of "no change" is the most likely scenario
(martingale hypothesis). Therefore the random walk model plays a major role in
the empirical analysis of financial time series, including the modeling of stock prices.
If the random walk hypothesis for stock prices is true, log-returns can be expected
to be approximately white noise. Note that the random walk model is
the limiting case of the AR(1) model as $\phi_1 \to 1$.

In addition we also applied the Akaike information criterion in order to com- 
pare parsimoniously nested models. 

Taking into account these guidelines we estimated four different models for the 
log-return series in the pre-event window: white noise, AR(1) (with $\phi_1$ close to 1),
MA(1) (only in 1 case) and linear trend model. All estimated parameters are sig- 
nificant at the 1% level. Most log-return series from our data sample can be mod- 
eled approximately as white noise. The goodness of fit was evaluated by inspec- 
tion of the residuals (tests for mean, autocorrelation and homoscedasticity). We 
then use the time series models to predict returns (the ARIMA forecasts in the 
event window) and abnormal returns (the forecast errors). 
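
The computation of abnormal returns for a single event can be sketched as follows. The sketch assumes the simplest of the models named above, white noise with a constant, so that the forecast for every event-window day is the pre-event sample mean; the function names and the index convention are hypothetical.

```python
import numpy as np

def log_returns(prices):
    """Log-returns R_t = ln(P_t / P_{t-1}) as in equation (3.1)."""
    p = np.asarray(prices, dtype=float)
    return np.log(p[1:] / p[:-1])

def abnormal_returns(returns, event_idx, pre_len=50, half_width=2):
    """Abnormal returns for t = -2, ..., +2 around returns[event_idx].

    Assumption: the normal return is forecast by the pre-event sample mean,
    i.e. a white-noise (ARIMA(0,0,0)) model with a constant.
    """
    r = np.asarray(returns, dtype=float)
    start_event = event_idx - half_width
    start_pre = start_event - pre_len
    if start_pre < 0 or event_idx + half_width >= r.size:
        raise ValueError("not enough observations around the event")
    pre = r[start_pre:start_event]                      # t = -52, ..., -3
    event = r[start_event:event_idx + half_width + 1]   # t = -2, ..., +2
    return event - pre.mean()                           # forecast errors
```

For an AR(1), MA(1) or trend model the constant forecast `pre.mean()` would be replaced by the corresponding multi-step forecast.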

Next we form three clusters, one for announced dividend increases, one for decreases
and one for constant dividends. For each cluster, we compute average abnormal
returns $\overline{AR}_t$ across sample members for day t. Finally, we test the null
hypothesis that the mean abnormal return on day t of the event window is equal to
zero. The test statistic $t_{stat}$ is the ratio of the (cross-sectional) mean abnormal return in
the event window and the (cross-sectional - time series) standard deviation of
mean abnormal returns in the pre-event window.

Assuming that the $\overline{AR}_t$ are identically, independently and normally distributed,
$t_{stat}$ has a Student-t distribution with (N-1) degrees of freedom under the null
hypothesis. Although daily excess returns are in general non-normal, by a standard
central limit theorem (CLT) the cross-sectional mean excess return converges to
normality as the number of sample securities increases. The main empirical results are
given in the next section.
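
The test statistic described above can be sketched as follows. The array layout (one row per announcement) and the use of the sample standard deviation with `ddof=1` are assumptions, since the text gives the statistic only in words.

```python
import numpy as np

def event_t_stats(pre_ar, event_ar):
    """t-statistics for the mean abnormal return on each event-window day.

    pre_ar:   array of shape (N events, pre-event days), pre-event residuals
    event_ar: array of shape (N events, event-window days), abnormal returns
    """
    pre_ar = np.asarray(pre_ar, dtype=float)
    event_ar = np.asarray(event_ar, dtype=float)
    mean_pre = pre_ar.mean(axis=0)      # cross-sectional mean per pre-event day
    sigma = mean_pre.std(ddof=1)        # time-series std of those daily means
    mean_event = event_ar.mean(axis=0)  # cross-sectional mean per event day
    return mean_event / sigma
```

Each returned value would then be compared with the critical value of the Student-t distribution at the chosen significance level.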






4 Results 

Table 2 summarizes our results for abnormal returns within the event window for
each cluster. To justify the use of Student's t-test, we check the hypothesis
that the time series of mean abnormal returns in each cluster is
normally distributed. Using the chi-square goodness-of-fit statistic, the Shapiro-Wilk
W statistic and tests based on skewness and kurtosis, we cannot reject this
hypothesis (the p-value of every test is greater than 0.05). Furthermore,
autocorrelations are not significant at the 5% level.



Table 2. Average daily abnormal returns for the event window in three clusters

                Dividend increases    Constant dividends    Dividend decreases
                Sample size: 74       Sample size: 75       Sample size: 27
Event period
day t           AR (%)     t_stat     AR (%)     t_stat     AR (%)     t_stat
-2              +0.454      2.093     -0.196     -1.103     -0.178     -0.390
-1              +0.268      1.231     -0.202     -1.048     -0.276     -0.606
 0              +0.617*     2.847     +0.002      0.115     -1.355*    -2.91
+1              -0.030     -0.136     +0.073      0.365     -0.119     -0.262
+2              +0.497      2.292     -0.138     -0.799     -0.089     -0.196
Σ               +1.806      3.540     -0.461     -1.230     -2.017     -2.290

* significant at the 1% level



For the 74 dividend increases the average abnormal return was 0.62% (signifi- 
cant at the 1% level) on the announcement day. 

In the case of constant dividends (sample size: 75), the average abnormal re- 
turns are not statistically different from 0 on any day within the event window. 
This supports the hypothesis that companies that leave their dividends unchanged 
communicate no significant new information to the market. 

In our sample (size: 27) of announced dividend decreases we find a statistically 
significant average abnormal daily return of -1.35% on the announcement day. 
This result supports the empirical findings for other markets that a cut in dividend 
payments conveys negative information to the public. 

In comparison to the abnormal returns induced by increasing dividends the re- 
ported negative return due to an announced contraction in dividends is much 
higher in absolute terms. This confirms the general observation on financial mar- 
kets that bad news has a greater impact on stock returns than good news. In the 
sense of cash-flow signaling theory this is an indication that analysts revise their 
forecasts for future earnings of companies much more strongly in the case of divi- 
dend decreases than increases. 

The cumulative abnormal return (CAR) is also highest in absolute terms
(2.02%, significant at the 5% level) in the case of dividend decreases,
followed by dividend increases (1.81%, significant at the 1% level). Bad
news causes greater volatility than good news; therefore the t-values are smaller in
the first case than in the second. In the case of constant dividends CAR is negative
and not significant.
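
As a quick arithmetic check, the CAR figures can be recovered by summing the daily average abnormal returns reported in Table 2 over the five event-window days. The dictionary layout below is only an illustrative convention.

```python
# Daily average abnormal returns (%) for t = -2, ..., +2, taken from Table 2.
daily_ar = {
    "increases": [0.454, 0.268, 0.617, -0.030, 0.497],
    "constant": [-0.196, -0.202, 0.002, 0.073, -0.138],
    "decreases": [-0.178, -0.276, -1.355, -0.119, -0.089],
}

# CAR is the sum of the daily abnormal returns over the event window.
car = {name: round(sum(values), 3) for name, values in daily_ar.items()}
print(car)
```

The sums agree with the last row of Table 2 (+1.806, -0.461 and -2.017).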

The fact that significant abnormal returns can be observed only on the very
day dividend changes are announced has important implications. Firstly, the
Austrian stock market seems to have a high degree of semistrong market efficiency.
This means that new information is incorporated into stock prices rather
quickly, at least within the same day the news is conveyed to the market. We also
find that the reaction of market prices is unbiased. The initial reaction reflects the
true implications of the information for the values of the securities, as there are no
subsequent corrections of the initial reactions on day t = +1 or t = +2. Finally, as
the returns on day t = -2 and t = -1 are not significant, there is no indication of
insider trading prior to the release of new information.



5 Conclusions 

The findings of this research show that dividend policy is an important source of 
information for investors. Announced dividend increases induce a significant posi- 
tive reaction in stock prices, whereas announced decreases in dividends lead to a 
significant fall in stock prices. Constant dividends leave stock prices unaltered. 

Our results for the Austrian stock market also support the semistrong form of 
the market efficiency hypothesis. Excess returns can be observed only on the an- 
nouncement day; there are no subsequent price reactions after that day. 

Finally we find little indication of leakage of information prior to the announcement
day. This supports the hypothesis of no insider trading on the Austrian
stock market prior to dividend announcements.



References 

Aharony J, Swary I (1980) Quarterly dividend and earnings announcements and stockholders'
returns: An empirical analysis. Journal of Finance 35 (1): 1-12
Amihud Y, Murgia M (1997) Dividends, taxes, and signaling: evidence from Germany.
Journal of Finance 52 (1): 397-408
Best RJ, Best RW (2001) Prior information and the market reaction to dividend changes.
Review of Quantitative Finance and Accounting 17: 361-376
Lang LHP, Litzenberger RH (1989) Dividend announcements: Cash flow signalling vs.
free cash flow hypothesis. Journal of Financial Economics 24: 181-191





On Tail Index Estimation and Financial Risk
Management Implications



Niklas Wagner 

Department of Business and Economics, Munich University of Technology 

D-80290 Munich, Germany 

niklas.wagner@wi.tum.de 



Summary Estimation bias is a critical issue when drawing inferences about the 
tails of the return distribution of risky assets. Risk management applications which 
rely on methods of extreme value theory must consider the statistical properties of 
the tail index estimator used; see for example recent simulation studies by Gomes 
and Oliveira (2001), Matthys and Beirlant (2000), and Wagner and Marsh (2000). 
The present contribution outlines potential effects of bias on quantile estimation 
thereby considering error sensitivities within the widespread Value-at-Risk- 
approach. The results show that particularly inference far out in the distribution 
tails is sensitive to bias. The paper further gives an overview of recent literature 
documenting small sample bias in tail index estimation and points out some new 
approaches aiming at its reduction. 



1 Introduction 

Financial risk management is particularly concerned with the extreme changes in 
the value of portfolios of risky assets. Depending on the type of risk under consideration,
a financial institution may distinguish market, credit and operational risks.
Risk management focuses on the distribution of losses where a widespread ap- 
proach is based on modelling quantiles of the loss distribution also denoted as 
Value-at-Risk (VaR); for discussions see for example Jorion (1997), Oehler 
(1998), and Ridder (1998). 

Extreme value statistics provide a theory for modelling extremal changes which 
typically are of main interest in risk management applications. As outlined for example
in Beirlant et al. (1996), Coles (2001), Embrechts et al. (1997), and Klüppelberg
(2002), extreme value theory characterizes the asymptotic tail behaviour
of some unknown distribution function. It thereby establishes three classes of limiting
distributions for normalized sample maxima. An important parameter of
these limiting distributions is the so-called "tail index", frequently denoted by $\alpha$.

Under the quite general assumption that the distribution function is heavy-tailed,
i.e. when it belongs to the maximum domain of attraction of the Fréchet
limiting distribution (see e.g. Embrechts et al. 1997), a commonly used estimator
is based on a proposal by Hill (1975):

$$\hat{\alpha} = \left( \frac{1}{k} \sum_{i=1}^{k} \ln X_{i,T} - \ln X_{k+1,T} \right)^{-1}. \qquad (1.1)$$

The above estimator is based on k upper order statistics $X_{i,T}$, i = 1, ..., k, with
$X_{i,T} \ge X_{i+1,T}$, from T sample observations having a common continuous distribution
function. It is constructed as a maximum likelihood estimator conditional on some
known threshold level $u = X_{k+1,T}$.

Given that k approaches infinity with the sample size T, but k remains a small
portion of the overall sample, one can show that the conditional maximum likelihood
estimator is asymptotically normally distributed:

$$\sqrt{k}\,(\hat{\alpha} - \alpha) \xrightarrow{d} N(b, \alpha^2). \qquad (1.2)$$

Note that the distributional limit in (1.2) does not contain an asymptotic bias term
as the threshold u is assumed to be known, i.e. one can write b = 0.
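
A minimal sketch of the Hill estimator in the form of (1.1) is given below. The function name and the simulated Pareto sample are illustrative assumptions; the threshold is taken as the (k+1)-th largest observation, as in the text.

```python
import numpy as np

def hill_estimator(sample, k):
    """Hill estimate of the tail index from the k largest observations."""
    x = np.sort(np.asarray(sample, dtype=float))[::-1]   # descending order
    if not 0 < k < x.size:
        raise ValueError("k must satisfy 0 < k < sample size")
    if x[k] <= 0.0:
        raise ValueError("threshold X_{k+1,T} must be positive")
    # Mean log-excess of the k upper order statistics over the threshold.
    mean_log_excess = np.mean(np.log(x[:k])) - np.log(x[k])
    return 1.0 / mean_log_excess

# Hypothetical check on simulated exact Pareto data with tail index 3.
rng = np.random.default_rng(0)
pareto_sample = rng.pareto(3.0, size=20000) + 1.0
print(hill_estimator(pareto_sample, k=2000))
```

For an exact Pareto sample the estimate is close to the true index for any reasonable k; for other heavy-tailed distributions the choice of k drives the bias discussed below.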

In applications, however, the threshold u is unknown and bias |b| > 0 enters. Estimation
of the threshold becomes essential to the Hill estimator's bias/variance
trade-off: Selecting too many potential tail observations reduces variance but in- 
troduces bias and vice versa. In general, the estimator’s optimality properties will 
therefore be weak and estimation performance is measured by mean squared esti- 
mation error. Studies which simulate the small sample properties of the Hill esti- 
mator and point out estimation bias under various distributional assumptions in- 
clude Gomes and Oliveira (2001), Matthys and Beirlant (2000), and Wagner and 
Marsh (2000), for example. 

Applications of extreme value statistics in the empirical finance literature typi- 
cally assume that the returns of some risky asset are drawn from a given common 
distribution function. Previous studies which apply the Hill estimation approach 
include for example Danielsson and de Vries (1997), Krämer and Runde (1996),
Lux (2001), and Wagner (2002b). A different estimation approach is based on fit- 
ting a so-called “Generalized Pareto Distribution” (GPD). Related studies include 
Bassi et al. (1998), Emmer et al. (1998), and McNeil (1998) as well as Lauridsen 
(2000) and Frey and McNeil (2000) who, instead of using raw returns which ex- 
hibit time-series dependence, model conditional returns and assume that the model 
innovations are iid draws from some common distribution function. 

In any case, tail index estimation is based on maximum likelihood combined
with asymptotic arguments from extreme value theory. The choice of a
threshold which yields a sample fraction of largest (smallest) observations is a potential
source of bias in any application irrespective of the estimation methodology
chosen. Given the potential impact of bias on practical risk assessment, this contribution
provides a sensitivity analysis of percentage biases in risk assessment
based on the standard VaR methodology (Section 2). Section 3 points out new
methodological approaches to the bias problem and gives a brief outlook
on management implications.





2 Risk Assessment under Biases in Tail Index Estimation

Based on the standard Value-at-Risk approach and quantile estimation, this section
outlines the potential effects of biases in Hill estimates of the tail index
on estimates of VaR. For the methods see also e.g. Embrechts et al. (1997), Ridder
(1998), and Coles (2001).



2.1 The Model 



We start with a simple discrete time model for the prices of some risky asset

$$S_t = S_{t-1} \exp(R_t), \quad 0 < t \le T, \quad S_0 > 0, \qquad (2.1)$$

where the random continuously compounded returns $R_t$ are drawn from a common
distribution function F. In the following we are concerned with losses occurring
from holding the asset during some given time period. One-period negative-signed
changes in asset value, i.e. losses $L_{t+1} = -(S_{t+1} - S_t)$, are given by model (2.1) as

$$L_{t+1} = -S_t\,[\exp(R_{t+1}) - 1], \qquad (2.2)$$

where $S_t > 0$ and hence a loss $L_{t+1} > 0$ occurs if and only if $R_{t+1} < 0$.

Now, given some initial time t asset value $S_t > 0$, the concept of (conditional)
Value-at-Risk is concerned with a one-period quantile $VaR_{p,t+1}$ of the gain/loss
distribution such that

$$P(L_{t+1} \ge VaR_{p,t+1}) = p, \qquad (2.3)$$

where p is some small probability. Rearranging (2.3) using (2.2) gives an equivalent
condition for the distribution F of the one-period returns $R_t$:

$$P\left(R_{t+1} \le \ln[1 - (VaR_{p,t+1}/S_t)]\right) = p.$$

Applying the generalized inverse $F^{\leftarrow}$ of the distribution function F and rearranging
yields the following expression for the Value-at-Risk:

$$VaR_{p,t+1} = -[\exp(F^{\leftarrow}(p)) - 1]\, S_t. \qquad (2.4)$$

Extreme value statistics can be used to infer the p-quantile $F^{\leftarrow}(p)$ of the lower
tail of the distribution function F. In order to derive an estimate of the tail index $\alpha$,
one may choose a sample of T returns, $R_1, \ldots, R_T$, from F and apply the Hill estimator
(1.1) to k upper order statistics resulting from negative return sample order statistics:
$X_{1,T} = -R_{T,T} \ge X_{2,T} = -R_{T-1,T} \ge \ldots \ge X_{k,T} = -R_{T-k+1,T}$. Selecting the (k+1)-th
smallest observation from the return sample, $R_{T-k,T}$, the estimate of the lower tail
p-quantile is (see e.g. Embrechts et al. 1997, p 348, for the upper tail quantile):

$$\hat{F}^{\leftarrow}(p) = \left( \frac{pT}{k} \right)^{-1/\hat{\alpha}} R_{T-k,T}. \qquad (2.5)$$
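
The chain from a return sample to a VaR figure, i.e. the Hill estimate on negated returns, the quantile estimate (2.5) and the VaR expression (2.4), can be sketched as follows. The function name, its arguments and the error handling are hypothetical.

```python
import numpy as np

def lower_tail_var(returns, k, p, asset_value=1.0):
    """Tail index, lower-tail p-quantile and VaR from a sample of returns."""
    r = np.sort(np.asarray(returns, dtype=float))   # ascending: r[0] is the worst return
    T = r.size
    if not 0 < k < T:
        raise ValueError("k must satisfy 0 < k < sample size")
    threshold = r[k]                                # (k+1)-th smallest return, R_{T-k,T}
    if threshold >= 0.0:
        raise ValueError("threshold return must be negative")
    losses = -r[:k]                                 # k largest losses X_{1,T}, ..., X_{k,T}
    alpha_hat = 1.0 / (np.mean(np.log(losses)) - np.log(-threshold))
    quantile = (p * T / k) ** (-1.0 / alpha_hat) * threshold    # equation (2.5)
    var = -(np.exp(quantile) - 1.0) * asset_value               # equation (2.4)
    return alpha_hat, quantile, var
```

For p below k/T the factor $(pT/k)^{-1/\hat{\alpha}}$ exceeds one, so the estimated quantile lies further out in the tail than the threshold return.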






2.2 Bias and Risk Assessment Implications



As is obvious from equations (2.4) and (2.5) in the preceding section, errors or biases
in the tail index enter nonlinearly into the estimation of conditional Value-at-Risk.
Conditioning on unit asset value at time t for
convenience, the relation is of the form

$$VaR_{p,t+1}(\alpha) = 1 - \exp\left(d_1 d_2^{-1/\alpha}\right), \qquad (2.6)$$

where $d_1$ and $d_2$ are given constants in empirical applications. We can now proceed
by calculating bias in VaR estimates under biases in the tail index estimate.
Of course, while we present our results against the background of estimation bias,
the analysis equivalently points out the sensitivities of VaR under estimation
error in general.

The calculations are done for a set of parameter values which may be considered
typical in financial applications: First, the threshold is set to a fixed level
of a 2 percent loss, $d_1 = R_{T-k,T} = -0.02$, a level below which heavy-tailed Pareto-like
behaviour can empirically be observed for stock market returns (e.g. Locarek-Junge
et al. 2002). Next, the subsample fraction is chosen with a fixed level of 2.5
percent, k/T = 0.025. For the other parameters, we allow a range of choices. The
probability p is set to typical levels of 0.0001, 0.001, 0.01 and 0.05 which implies
analysing VaR estimation sensitivity under scaling through the constant $d_2 = pT/k$
in (2.6). Referring to the simulation results in Wagner and Marsh (2000) who find
biases of up to roughly 20 percent in the Hill estimator for tail index values of
typical financial return models, values of b = 0.05, 0.1 and 0.2 are chosen for the
percentage estimation biases. Finally, all sensitivity calculations are performed for
typical tail index parameter values of 2, 3 and 4.

Based on these settings, Table 2.1 reports the corresponding relative errors in VaR
estimates. Using (2.6), relative errors are calculated as

$$\pm\Delta VaR(\alpha) = \frac{VaR[\alpha(1 \pm b)] - VaR(\alpha)}{VaR(\alpha)}$$

and the numbers in the table are given in percentage terms. Hence, the numbers
indicate the percentage sensitivities of VaR estimates under percentage biases in
the tail index assessment.
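
The sensitivity calculation can be reproduced with a few lines of code. The sketch below evaluates relation (2.6) with the stated parameter choices (threshold return of -0.02, subsample fraction k/T = 0.025) and computes the relative error for a given percentage bias b; the function names are illustrative.

```python
import math

def var_unit(alpha, p, k_over_t=0.025, threshold=-0.02):
    """VaR for unit asset value, relation (2.6), with d1 = threshold and d2 = p*T/k."""
    d1 = threshold
    d2 = p / k_over_t
    return 1.0 - math.exp(d1 * d2 ** (-1.0 / alpha))

def relative_error(alpha, p, b):
    """Relative VaR error when the tail index is taken as alpha * (1 + b)."""
    base = var_unit(alpha, p)
    return (var_unit(alpha * (1.0 + b), p) - base) / base

# True tail index 3, probability level 1 percent, bias of +/- 10 percent.
for b in (0.10, -0.10):
    print(f"b = {b:+.0%}: {relative_error(3.0, 0.01, b):+.1%}")
```

For these inputs the sketch gives about -2.7 percent and +3.4 percent, the values quoted for this case in the discussion of the results.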



What can we now learn from our numerical results in Table 2.1?

• First of all, a striking observation is that an over-estimation of the tail index $\alpha$
always yields an overestimated VaR for the 5 percent probability level. In other
words, assuming a thinner tail yields a larger estimate of the quantile and hence
a larger assessment of risk as measured by VaR. This result is clearly
counterintuitive and points out one of the well-discussed deficiencies of VaR as
a measure of risk. For all the other probability levels, p ≤ 0.01, consistent with
intuition, an over-estimation of $\alpha$ yields under-estimated VaRs as indicated by
negative signs for the top $\Delta VaR(\alpha)$ entries under each parameter setting.






Table 2.1. Sensitivity of VaR assessment given by ±ΔVaR(α) in %.

p         b        α = 2     α = 3     α = 4
5%        +5%      +1.7%     +1.1%     +0.8%
          -5%      -1.8%     -1.2%     -0.9%
          +10%     +3.2%     +2.1%     +1.6%
          -10%     -3.8%     -2.5%     -1.9%
          +20%     +5.9%     +3.9%     +2.9%
          -20%     -8.3%     -5.6%     -4.2%
1%        +5%      -2.1%     -1.4%     -1.1%
          -5%      +2.4%     +1.6%     +1.2%
          +10%     -4.0%     -2.7%     -2.0%
          -10%     +5.1%     +3.4%     +2.5%
          +20%     -7.2%     -4.9%     -3.7%
          -20%     +12%      +7.8%     +5.8%
0.1%      +5%      -7.0%     -4.8%     -3.7%
          -5%      +8.4%     +5.6%     +4.2%
          +10%     -13%      -9.1%     -6.9%
          -10%     +18%      +12%      +9.1%
          +20%     -23%      -16%      -12%
          -20%     +46%      +30%      +22%
0.01%     +5%      -11%      -7.9%     -6.1%
          -5%      +13%      +9.5%     +7.2%
          +10%     -20%      -15%      -11%
          -10%     +29%      +21%      +16%
          +20%     -33%      -25%      -20%
          -20%     +73%      +53%      +39%











• Due to the nonlinearity of the underlying relation and for the parameters chosen,
over-estimation of $\alpha$ gives absolutely smaller percentage deviations in
VaR than under-estimation. For p ≤ 0.01 this implies that over-estimating $\alpha$ results
in a smaller lack of VaR capital than the corresponding capital over-allocation
due to the under-estimation of $\alpha$ by the same magnitude.
Hence, while over-estimating $\alpha$ yields failure to protect against losses at the
given probability level, under-estimating $\alpha$ is particularly costly in terms of
capital requirements.

• For the moderate quantiles based on p = 0.05 and p = 0.01, percentage errors in
$\alpha$ are rather dampened with respect to their impact on VaR. Assume a typical
situation where VaR has to be determined at the 1 percent level and the true tail
index equals three. Then a positive 10 percent deviation in the tail index value
will cause 2.7 percent under-estimation of the true VaR. A negative 10 percent
deviation in the tail index value will cause 3.4 percent over-estimation of the
true VaR.

• The situation changes dramatically once the probability levels approach the
area of extreme quantiles, p = 0.001 and p = 0.0001. Additionally, under these
probability levels, the sensitivities reach enormous values when the true tail index
becomes smaller, approaching a parameter value of two. In the words of
Embrechts (2000): "produce confidence intervals e.g. beyond 99% VaR; a dangerous
job but someone has to do it".



3 Outlook 

The preceding section makes clear that estimation bias is a challenge to
applications of extreme value statistics. Also, bias may relate to the magnitude of
the tail index, then causing systematic bias in capital allocation. When, for example,
relatively low $\alpha$'s tend to be over-estimated and relatively high $\alpha$'s tend to be
under-estimated, as indicated e.g. by the results in Wagner and Marsh (2000), even under
the moderate probability level of 1 percent, substantial over- and under-allocation
of risk capital may arise in management applications.

Recent methodological approaches give promising results in reducing tail index 
estimation bias. The methods include the exponential regression approach by 
Feuerverger and Hall (1999) and Beirlant et al. (1999) with a refinement by Mat- 
thys and Beirlant (2001). Simulation results of the latter authors indicate that the 
estimator overcomes typical problems of the Hill plot under various model distri- 
butions. However, the authors do not find a dominant quantile estimation tech- 
nique. Other methodological approaches are based on bias-corrected versions of 
the Hill estimator. Gomes and Oliveira (2001) analyze bootstrap correction methods.
Huisman et al. (2001) use a Hill plot regression approach with an application
given in Huisman et al. (1998). The analysis in Wagner (2002a) relies on the Hill 
plot based maximum occupation time estimator of Drees et al. (2000). 

Despite the critical view put on VaR estimation in this contribution, one should
point out that overall progress in extreme value theory and research in finance during
recent years has much improved quantitative risk management methods. Keeping
the potential and limitations of those methods in mind will allow further improvements
in handling practical risk management tasks.



References 

Bassi F, Embrechts P, Kafetzaki M (1998) Risk Management and Quantile Estimation. In:
Adler RJ, Feldman RE, Taqqu MS (eds) A Practical Guide to Heavy Tails. Birkhäuser,
Boston, pp 111-130

Beirlant J, Dierckx G, Goegebeur Y, Matthys G (1999) Tail Index Estimation and an
Exponential Regression Model. Extremes 2: 177-200

Beirlant J, Vynckier P, Teugels JL (1996) Practical Analysis of Extreme Values. Leuven
University Press

Coles SG (2001) An Introduction to Statistical Modeling of Extreme Values. Springer, 
London 

Danielsson J, de Vries CG (1997) Tail Index and Quantile Estimation with Very High
Frequency Data. Journal of Empirical Finance 4: 241-257

Drees H, de Haan L, Resnick S (2000) How to Make a Hill Plot. Annals of Statistics 28: 
254-274 

Embrechts P (2000) Extreme Value Theory: Potential and Limitations as an Integrated Risk 
Management Tool. Derivatives Use, Trading & Regulation 6: 449-456 

Embrechts P, Klüppelberg C, Mikosch T (1997) Modelling Extremal Events for Insurance
and Finance. Springer, New York

Emmer S, Klüppelberg C, Trüstedt M (1998) VaR - Ein Mass für das extreme Risiko.
Solutions 2: 53-63

Feuerverger A, Hall P (1999) Estimating a Tail Exponent by Modelling Departure from a 
Pareto Distribution. Annals of Statistics 27: 760-781 

Frey R, McNeil AJ (2000) Estimation of Tail-Related Measures for Heteroscedastic Finan- 
cial Time Series: An Extreme Value Approach. Journal of Empirical Finance 7: 271- 
300 

Gomes MI, Oliveira O (2001) The Bootstrap Methodology in Statistics of Extremes — 
Choice of the Optimal Sample Fraction. Extremes 4: 331-358 

Hill BM (1975) A Simple General Approach to Inference about the Tail of a Distribution.
Annals of Statistics 3: 1163-1174

Huisman R, Koedijk KG, Kool CJM, Palm F (2001) Tail-Index Estimates in Small Sam- 
ples. Journal of Business and Economic Statistics 19: 208-216 

Huisman R, Koedijk KG, Pownall R (1998) Fat Tails in Financial Risk Management. Jour- 
nal of Risk 1: 47-62 

Jorion P (1997) Value at Risk: The New Benchmark for Controlling Market Risk. 
McGraw-Hill, New York 

Kearns P, Pagan A (1997) Estimating the Density Tail Index for Financial Time Series. Re- 
view of Economics and Statistics 79: 171-175 

Klüppelberg C (2002) Risk Management with Extreme Value Theory. Presentation
Manuscript, Munich University of Technology

Krämer W, Runde R (1996) Stochastic Properties of German Stock Returns. Empirical
Economics 21: 281-306 

Lauridsen S (2000) Estimation of Value at Risk by Extreme Value Methods. Extremes 3: 
107-144 






Locarek-Junge H, Strassberger M, Wagner N (2002) Wann beginnt die Krise? — Ein Blick 
auf Finanzmarktrenditen. Forthcoming in: Blum U et al. (eds) Krisenkommunikation. 
Teubner, Stuttgart 

Lux T (2001) The Limiting Extremal Behavior of Speculative Returns: An Analysis of
Intra-Daily Data from the Frankfurt Stock Exchange. Applied Financial Economics 11:
299-315

Matthys G, Beirlant J (2000) Adaptive Threshold Selection in Tail Index Estimation. In: 
Embrechts P (ed): Extremes and Integrated Risk Management. Risk Books, London, 
pp 37-49 

Matthys G, Beirlant J (2001) Extreme Quantile Estimation for Heavy-Tailed Distributions.
Working Paper, University of Leuven

McNeil A (1998) Calculating Quantile Risk Measures for Financial Time Series using
Extreme Value Theory. Preprint, ETH Zurich

Oehler A (ed) (1998) Credit Risk and Value-at-Risk Alternativen. Schaeffer-Poeschel,
Stuttgart

Ridder T (1998) Basics of Statistical VaR-Estimation. In: Bol G, Nakhaeizadeh G,
Vollmer K-H (eds) Risk Measurement, Econometrics, and Neural Networks. Physica,
Heidelberg, pp 161-187

Wagner N (2002a) The Hill Estimator in Financial Risk Assessment and an Application to 
Extremal Exchange Rate Risk. In: Batten JA, Fetherston TA (eds) Financial Risk and 
Risk Management. Elsevier, Amsterdam, pp 173-187 

Wagner N (2002b) Value-at-Risk for Financial Assets Determined by Moment Estimators 
of the Tail Index. Forthcoming in: Opitz O, Schwaiger M (eds) Classification in Man- 
agement Science. Springer, Berlin 

Wagner N, Marsh TA (2000) On Adaptive Tail Index Estimation for Financial Return 
Models. Working Paper No. RPF-295, U.C. Berkeley 





Project Risk Management by a Probabilistic 
Expert System 



Andre Ahuja and Wilhelm Rödder

FernUniversität in Hagen, Profilstr. 8, 58084 Hagen, Germany
andre.ahuja@fernuni-hagen.de, wilhelm.roedder@fernuni-hagen.de



Abstract Efficient applications of expert systems to project risk management
problems are rare. In this paper we address this gap by using
the probabilistic expert system shell SPIRIT. The rule-based shell's power in conditioning,
inference and reasoning under incomplete information works well
for risk estimation and classification. A key characteristic of SPIRIT is the possibility
to integrate project objectives into the risk management model. Dependencies
between risk variables can be modelled by the user if known beforehand,
whereas hidden dependencies might be detected by the system itself. Because
of their novelty, projects suffer from incomplete information, and it is
this incompleteness which SPIRIT handles at high information fidelity. Furthermore,
undirected inference is possible, due to the undirected graphical structure in
which knowledge is acquired and processed. So, in an early-stage risk management
situation - where the final model in terms of certain variables and/or their respective
dependencies is not yet available - preliminary risk analyses and even recommendations
for adequate risk treatment measures are possible, too. A medium-sized
product development example, including 12 binary variables and 34 rules,
shows the inferential power of SPIRIT.



1. Introduction 

Novelty, complexity and medium- to long-term planning periods expose a project
to considerable uncertainty [3], [10]. The novelty implies a lack of experience
and concrete numbers so that project risk management (PRM) must be able to
identify possible project risks, to evaluate them and, if necessary, to treat them [6], [12].
Because of the complexity of this process the use of information processing
computer systems is undisputed [11]. In the present paper we demonstrate how
expert systems can support this process. An expert system is a computer program
which adapts knowledge about a certain domain and responds to user questions in
special situations. Typical examples relate to medical or technical
diagnostics [5], [1].

After a short introduction to the probabilistic expert system shell SPIRIT in
section 2, we develop the model of a PRM situation in section 3 and then study
different scenarios in section 4. The objective is to answer the question of which
parts of project risk analysis can be efficiently supported by the expert system.
Section 5 is a summary and points out possible future research.



2. The Expert System Shell SPIRIT 

Partial knowledge about projects, and especially about project risks, is modelled in SPIRIT using the SPIRIT syntax [8], thus forming a PRM knowledge base. This requires the definition of a finite set of finite-valued variables plus a set of probabilistic rules. A probabilistic rule is an expression of the type x A ⇒ B, read: A implies B with probability x. Here A and B are propositional sentences built from literals <Variable>=<value> and linked by the operators ∧ (and), ∨ (or), ¬ (not), ⇒ (implies) and respective parentheses. To each such rule there is attached a probability x with which the rule is estimated to be true in the domain under consideration. The set of such probabilistic rules is the rule base R.

Once all probabilistic rules are provided to the system, it adapts them by generating a probability distribution on the set $\Omega$ of all configurations $\omega$, the elements of the cross product of all variables' ranges. This distribution respects the given rules and does not generate any unintended dependency between the variables involved; it obeys the principle of information fidelity [9]. Mathematically speaking, from the probabilistic extension $W(R)$, which is the set of all probability distributions respecting the linear restrictions given by the set of rules R, we choose the distribution solving

$$P^* = \arg\max \{\, H(P) \mid P \in W(R) \,\}. \qquad (1)$$

The function H of a distribution P is its entropy

$$H(P) = -\sum_{\omega} P(\omega)\, \operatorname{ld} P(\omega), \qquad (2)$$

where ld denotes the binary logarithm (base 2).

Solving (1) yields the distribution P*, which contains exactly the information given by the rule base; this information is measured in bits, cf. [4]. The system SPIRIT presents the result of (1) in the form of an independency graph including the marginal distributions of the variables, see below. In this graph the user can check mutual impacts among variables by focussing on different states.
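To make problem (1) concrete, the following minimal sketch computes the maximum-entropy distribution for a toy knowledge base with two binary variables A and B and a single rule "0.85 A ⇒ B", read as P(B | A) = 0.85. It is not SPIRIT's algorithm; it only assumes the standard result that, under one linear restriction, the entropy maximizer has the exponential form p(ω) ∝ exp(λ f(ω)), and finds the multiplier λ by bisection. All names are hypothetical.

```python
import math

def maxent_one_rule(x):
    """Maximum-entropy distribution over (A, B) subject to P(B | A) = x."""
    configs = [(a, b) for a in (True, False) for b in (True, False)]

    def feature(a, b):
        # Linear form of the rule: (1 - x) * P(A and B) - x * P(A and not B) = 0
        if a and b:
            return 1.0 - x
        if a and not b:
            return -x
        return 0.0

    def constraint(lam):
        # Unnormalised expectation of the feature under p(w) ~ exp(lam * f(w))
        return sum(feature(a, b) * math.exp(lam * feature(a, b)) for a, b in configs)

    lo, hi = -50.0, 50.0          # constraint(lo) < 0 < constraint(hi)
    for _ in range(200):          # bisection on the Lagrange multiplier
        mid = 0.5 * (lo + hi)
        if constraint(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    weights = {w: math.exp(lam * feature(*w)) for w in configs}
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}

def entropy_bits(p):
    """Entropy H(P) in bit, as in equation (2)."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0.0)

p_star = maxent_one_rule(0.85)
p_a = p_star[(True, True)] + p_star[(True, False)]
p_b_given_a = p_star[(True, True)] / p_a

# Another distribution that also satisfies the rule, with P(A) fixed at 0.5
p_other = {(True, True): 0.425, (True, False): 0.075,
           (False, True): 0.25, (False, False): 0.25}
print(round(p_b_given_a, 4), entropy_bits(p_star) > entropy_bits(p_other))
```

Both distributions respect the rule, but only the first avoids committing to anything the rule does not say, which is the sense in which P* carries no unintended information.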



3. Building the Knowledge Base 

We consider a product development project and therefore define the following
variables:

• product characteristics: QUALITY, INNOVATION 

• project situation: ROLL_OUT, BUDGET, CAPACITY, DURATION, 
PROCEED, LICENSE 

• valuation/score: ECONOMY, IMPORTANCE, SALES, CONTINUE. 








Consider the case where the expert estimates a high probability (85%, say) for a late ROLL_OUT at the end of the project, due to a risky license situation. This knowledge is entered as the probabilistic rule

0.85 LICENSE=doubtful ∧ PROCEED=finish ⇒ ROLL_OUT=delay (3)

Such a rule represents a linear restriction [4] and thus fixes a hyperplane in the probability space. In general the set of all rules R does not determine a unique distribution; the extension W(R) is rather a convex polyhedron. Solving (1) we pick out one of these distributions, namely the one with the highest information fidelity. As such rules in their clear syntax are self-explanatory, we give an extract of R in figure 1.

Fig. 1: An extract of R in SPIRIT's rule window (probabilities shown as [?] are not legible)

 0  [?]   ECONOMY=profitable ⇒ CONTINUE=no_doubt
 1  [?]   ROLL_OUT=sure ∧ ECONOMY=profitable ⇒ CONTINUE=no_doubt
 2  0.85  LICENSE=doubtful ∧ PROCEED=finish ⇒ ROLL_OUT=delay
 3  [?]   LICENSE=doubtful ⇒ PROCEED=start
 4  [?]   ECONOMY=not_profitable ∧ SALES=doubtful ⇒ CONTINUE=check
 5  [?]   ECONOMY=profitable ∧ SALES=sure ∧ IMPORTANCE=strategic ⇒ CONTINUE=no_doubt
 6  [?]   ECONOMY=profitable ∧ SALES=sure ∧ IMPORTANCE=nice2have ⇒ CONTINUE=no_doubt
 7  0.75  IMPORTANCE=strategic ⇒ CONTINUE=no_doubt
 8  [?]   PROCEED=start ∧ ECONOMY=not_profitable ⇒ CONTINUE=check
 9  0.70  INNOVATION=konservativ ∧ BUDGET=slack ⇒ LICENSE=secure
10  0.80  PROCEED=start ∧ ROLL_OUT=delay ⇒ CONTINUE=check
11  0.75  QUALITY=well
12  [?]   SALES=doubtful ∧ DURATION=shortterm ⇒ ECONOMY=not_profitable
13  [?]   PROCEED=start ⇒ CAPACITY=slack

The set $\Omega$ of all configurations $\omega$ of the given 12 binary variables has $n = 2^{12} = 4096$ elements. The uniform distribution on $\Omega$ has maximum entropy: ld 4096 = 12 bit. Solving (1) and respecting all 34 rules, we find a distribution P* with an entropy of 5.24 bit, significantly lower than the initial one. The difference 12 - 5.24 = 6.76 bit is a measure of the amount of knowledge adapted in the system [2]. We take this significant reduction of uncertainty as a good reason to trust the system's answers, as demonstrated in the next section. Figure 2 shows the corresponding independency graph and the respective marginal probabilities of the variables' values.
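The arithmetic of this paragraph can be retraced in a few lines. The sketch below only uses the figures stated in the text (12 binary variables, an entropy of 5.24 bit for P*); the variable names are illustrative.

```python
import math

n_variables = 12
n_configurations = 2 ** n_variables            # 4096 configurations
h_uniform = math.log2(n_configurations)        # entropy of the uniform distribution, in bit
h_star = 5.24                                  # entropy of P* as reported in the text
knowledge = h_uniform - h_star                 # knowledge adapted in the system, in bit
print(n_configurations, h_uniform, round(knowledge, 2))   # 4096 12.0 6.76
```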







Fig. 2: Independence graph for the knowledge base „RISKMGMT.spi“ 




4. Risk Analysis by SPIRIT 

After solving (1) as in the last section, the system has P* at its disposal, the knowledge base about the domain. This general knowledge can be modified by focussing on a certain situation. We use this concept of focussing to study the impact of a probability change of the variable LICENSE upon ROLL_OUT and ECONOMY. More concretely: we are interested in the effect which the risk variable LICENSE exerts on the project's deadline and on its economy. This effect shall be analysed for the short-term development of a conservative product of high quality and for a secure sales situation. Thus putting

QUALITY=well, DURATION=shortterm, (4)
INNOVATION=conservative, SALES=secure

yields the desired focus, which now permits the analysis shown in table 1.



Table 1. Objectives ROLL_OUT and ECONOMY contingent on the LICENSE situation

Variable                         LICENSE             ROLL_OUT          ECONOMY
Value                            secure   doubtful   sure     delay    profit.   not_profit.
Probabilities, conditioned
by focus (4)                     0.79     0.21       0.79     0.21     0.45      0.55
Add. condition LICENSE=secure    1.00     0.00       0.87     0.13     0.46      0.54
Change                                               +10%     -38%     +2.2%     -1.8%
Add. condition LICENSE=doubtf.   0.00     1.00       0.47     0.53     0.38      0.62
Change                                               -41%     +152%    -38%      +13%

The first row shows the respective marginal probabilities for the focussed situation. The numbers represent average experience about such projects in general: there is a 79% probability of a sure ROLL_OUT and a 45% probability of profit. The system is now ready for an analysis with respect to the variable LICENSE. With a secure license situation, the probability of a sure ROLL_OUT rises by 10% in relative terms and the probability of a delay falls by 38%. The probabilities of the ECONOMY values undergo modest changes of +2.2% and -1.8%. The changes are much more pronounced with a doubtful license situation. Here we have a strong negative effect upon the ROLL_OUT expectation and upon the expected economy of the project.
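The "Change" rows for ROLL_OUT are relative changes with respect to the focussed situation. As a small check, assuming the two-digit probabilities printed in table 1, they can be recomputed as follows (function and variable names are illustrative):

```python
def relative_change(before, after):
    """Relative change of a probability, in percent."""
    return 100.0 * (after - before) / before

# ROLL_OUT probabilities from table 1: focus (4) vs. additional condition on LICENSE
focus = {"sure": 0.79, "delay": 0.21}
secure = {"sure": 0.87, "delay": 0.13}
doubtful = {"sure": 0.47, "delay": 0.53}

changes_secure = {v: round(relative_change(focus[v], secure[v])) for v in focus}
changes_doubtful = {v: round(relative_change(focus[v], doubtful[v])) for v in focus}
print(changes_secure)     # {'sure': 10, 'delay': -38}
print(changes_doubtful)   # {'sure': -41, 'delay': 152}
```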

SPIRIT permits the consideration of utilities u for each variable, thus allowing the calculation of a (numerical) expected value for the model's objectives. This may permit a more profound valuation in decision problems than the mere use of probabilities. In the above model, for example, attributing u(profitable) = 100 and u(not_profitable) = -50, the expected utility of 16.91 for the initial situation rises to 19.72 for a secure and decreases to 6.51 for a doubtful license situation. The project manager should be aware of the fact that, in the loss case (a doubtful license situation that is left untreated), there is a severe 61.5% decrease of expected profit.
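The expected utilities quoted above follow from u(profitable) = 100 and u(not_profitable) = -50. The sketch below inverts the expected-utility formula to show which unrounded P(profitable) each reported value implies (they agree with the two-digit entries of table 1) and recomputes the 61.5% decrease. Helper names are hypothetical.

```python
U_PROFIT, U_LOSS = 100.0, -50.0   # utilities given in the text

def expected_utility(p_profit):
    return U_PROFIT * p_profit + U_LOSS * (1.0 - p_profit)

def implied_probability(utility):
    """Invert expected_utility: which P(profitable) yields this expected utility?"""
    return (utility - U_LOSS) / (U_PROFIT - U_LOSS)

eu_initial, eu_secure, eu_doubtful = 16.91, 19.72, 6.51   # values reported in the text
for eu in (eu_initial, eu_secure, eu_doubtful):
    print(eu, round(implied_probability(eu), 3))

decrease = 100.0 * (eu_doubtful - eu_initial) / eu_initial
print(round(decrease, 1))   # -61.5
```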

The expert system shell SPIRIT also allows focussing on virtual, i.e. not certain, situations. Going back to table 1, we might be interested in analysing the case in which the information about a doubtful license situation is vague. What if the attribute doubtful has a probability of 80% rather than 100%? Such vagueness can be of a subjective nature (linguistic imprecision) or of a statistical nature (80% of all licenses are there). In either case the system responds as in table 2.



Table 2. (continued, virtual license situation)

Variable        LICENSE             ROLL_OUT          ECONOMY
Value           secure   doubtful   sure     delay    profitable   not_profitable
Probabilities   0.20     0.80       0.55     0.45     0.39         0.61
Change                              -30%     +114%    -13%         +11%



Please notice that virtual focussing is an important instrument in real world 
situations. On the one hand the effect is not as extreme as the second focussing 
LICENSE=doubtful shown in table 1. But on the other hand it may be closer to 
the subjective appraisal of the decision maker, and still the impact on ROLL_OUT 
and ECONOMY obviously is significant. 
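One way to read the numbers of table 2, offered here as an assumption and not as a description of SPIRIT's internals, is as a mixture of the two certain-focus rows of table 1, weighted with the new LICENSE probabilities (Jeffrey's rule). With the rounded table 1 values this reproduces table 2 up to rounding:

```python
# Conditional probabilities from table 1 (rounded to two digits in the source)
given_secure = {"ROLL_OUT=sure": 0.87, "ROLL_OUT=delay": 0.13, "ECONOMY=profitable": 0.46}
given_doubtful = {"ROLL_OUT=sure": 0.47, "ROLL_OUT=delay": 0.53, "ECONOMY=profitable": 0.38}

def virtual_focus(p_doubtful):
    """Mix the two certain-focus results with the new LICENSE probabilities."""
    return {k: (1.0 - p_doubtful) * given_secure[k] + p_doubtful * given_doubtful[k]
            for k in given_secure}

mixed = virtual_focus(0.80)
print({k: round(v, 3) for k, v in mixed.items()})
```

The ROLL_OUT values come out as 0.55 and 0.45; the ECONOMY value differs from the tabulated 0.39 only in the third digit, which is within the rounding of the inputs.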



5. Summary 

The formulation of expert knowledge as a set of probabilistic rules, together with the principle of maximizing entropy, is a powerful instrument for deriving an unbiased knowledge base about a domain such as project risk management, even under only partial knowledge. The expert system shell SPIRIT permits focussing on special scenarios as well as a sensitivity analysis between the model variables involved. This playing through of different future project situations even includes virtual or vague situations. Utilities, if available, can also be handled by the system in place of mere probabilities to evaluate such situations. The reader interested in studying the model RISKMGMT.spi is invited to visit www.xspirit.de.



References 

1. Breese, J.S., Heckerman, D. (1996): Decision-theoretic Troubleshooting: A Framework for Repair and Experiment, in: Uncertainty in Artificial Intelligence 12, Morgan Kaufmann Publishers, San Francisco, California, p. 124-132.

2. Kulmann, F. (2002): Wissen und Information in konditionalen Modellen, Deutscher Universitäts-Verlag, Wiesbaden, p. 56-58.

3. Madauss, B.-J. (1984): Projektmanagement, Stuttgart.

4. Meyer, C.-H. (1998): Korrektes Schließen bei unvollständiger Information, Peter Lang, Frankfurt a.M.

5. Lauritzen, S.L., Thiesson, B., Spiegelhalter, D.J. (1994): Diagnostic systems by model selection: a case study, Lecture Notes in Statistics, 89, Springer, p. 143-152.

6. Raftery, J. (1994): Risk analysis in project management, E. & F.N. Spon, London.

7. Reucher, E., Rödder, W. (2001): Modellierung von Entscheidungsproblemen unter Verwendung von probabilistischen Konditionalen, in: Fleischmann, B. et al.: Operations Research Proceedings 2000, Springer, p. 254-259.

8. Rödder, W. (2000): Conditional Logic and the Principle of Entropy, Artificial Intelligence, 117, p. 83-106.

9. Rödder, W. (2001): Knowledge Processing under Information Fidelity, Proc. IJCAI 2001 - Seventeenth International Joint Conference on Artificial Intelligence, Seattle, Washington, p. 749-754.

10. Rödder, W., Ahuja, A. (forthcoming): Projektmanagement: Konzept, Aufgaben Techniken, Kohlhammer, Stuttgart.

11. Schon, D., Diederichs, M., Busch, V. (2001): Chancen- und Risikomanagement im Projektgeschäft, in: Controlling (2001) 7, p. 379-387.

12. Schnorrenberg, U., Goebels, G. (1997): Risikomanagement in Projekten, Braunschweig, Wiesbaden.

13. Internet: www.xspirit.de





Regulatory Impacts on Credit Portfolio 
Management 



Ursula Theiler*, Vladimir Bugera**, Alla Revenko**, Stanislav Uryasev** 

*Risk Training, Carl-Zeiss-Str. 11, D-83052 Bruckmuehl, Germany, 
mailto: theiler@risk-training.org. 

**Risk Management and Financial Engineering Lab, University of Florida,
303 Weil Hall, Gainesville, FL 32611-65, USA,
mailto: bugera@ufl.edu, alla@ufl.edu, uryasev@ufl.edu. 



Abstract Efficient credit portfolio management is a key success factor of bank management. Discussions of the new capital adequacy proposals by the Basle Committee on Banking Supervision highlight the need to consider credit risk management from both the internal and the regulatory point of view. We introduce an optimization approach for the credit portfolio that maximizes expected returns subject to internal and regulatory risk constraints. With a simplified bank portfolio we examine the impact of the regulatory risk limitation rules on the optimal solutions.



1 Introduction 

Efficient credit portfolio management is a key success factor of bank management. In an adverse market environment and under intensifying competition, banks are exposed to increasing risks and decreasing return margins of their credit portfolios, while bank shareholders demand higher risk premiums for their invested capital. The ability to identify risk-return optimal portfolios becomes a fundamental element of credit portfolio management. The recent discussions of the Basle Committee on Banking Supervision highlight the need to manage credit risk simultaneously from an internal and a regulatory perspective.

In this paper, we give a survey of a new optimization algorithm that determines risk-return efficient credit portfolios under internal and regulatory credit risk constraints. We formulate the optimization problem for the credit portfolio based on the new risk measure Conditional Value at Risk (CVaR) and derive risk-return ratios for the optimal portfolios (section 2). With an application example, we analyze the risk-return structure of an optimal portfolio. We examine the impact of the regulatory risk limitation rules and visualize how they may lead to inefficiencies in credit portfolio management (section 3).







2 Optimization Approach 

2.1 Definition of the CVaR Risk Measure 

The risk measure Value at Risk (VaR), commonly applied in finance, lacks the sub-additivity property when return distributions are not normal. This means that diversifying the portfolio may increase portfolio VaR. A similar percentile risk measure, Conditional Value at Risk (CVaR), does not have this drawback. The term Conditional Value-at-Risk was introduced in [5]. For continuous distributions, CVaR is equal to the conditional expectation beyond VaR, see [5]. For general distributions, however, it is a weighted average of VaR and the conditional expectation beyond VaR, see [6]. CVaR can be applied to measure loss risk from any asymmetric and discontinuous loss distribution with discrete probabilities, and it obeys the property of coherence, see [1,4,6], a set of axioms that a risk measure should meet from the point of view of a regulator [2]. CVaR has been shown to be appropriate for credit portfolio risk measurement [4,5,6].

Let $x = (x_1, \ldots, x_n)'$ be a vector of positions of credit assets of a portfolio, and $y = (y_1, \ldots, y_n)'$ be a vector of the corresponding market prices. For continuous distributions, we define the CVaR deviation $\mathrm{CVaR}_\alpha(L(x,y))$ of the portfolio loss risk as

$$\mathrm{CVaR}_\alpha(L(x,y)) = E[\,L(x,y) \mid L(x,y) \geq \mathrm{VaR}_\alpha(L(x,y))\,], \qquad (1)$$

where the loss function L(x,y) is the difference between the expected value of the portfolio and its uncertain value, i.e. $L(x,y) = E[y]'x - y'x$, and $\mathrm{VaR}_\alpha(L(x,y))$ is the $\alpha$-quantile of the loss function L(x,y).¹
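A sample version of definition (1) can be sketched as follows. The scenario values are hypothetical, the scenarios are assumed equally likely, and the estimator simply averages the losses beyond the sample VaR; it assumes that at least one scenario lies beyond the VaR and ignores the refinement needed when a probability atom sits on the quantile.

```python
import math

def cvar_deviation(values, alpha):
    """Sample VaR and CVaR of the loss L_k = mean(values) - values[k]."""
    n = len(values)
    mean_value = sum(values) / n
    losses = sorted(mean_value - v for v in values)
    k = math.ceil(round(alpha * n, 9))     # number of scenarios at or below the VaR
    var = losses[k - 1]                    # sample alpha-quantile of the loss
    tail = losses[k:]                      # scenarios beyond the VaR (must be non-empty)
    return var, sum(tail) / len(tail)

# Hypothetical portfolio values y_k'x in ten equally likely scenarios
portfolio_values = [100, 102, 98, 95, 105, 90, 101, 99, 103, 107]
var, cvar = cvar_deviation(portfolio_values, alpha=0.8)
print(var, cvar)   # 2.0 7.5
```

Because the loss is measured relative to the expected portfolio value, the result is a deviation measure: shifting all scenario values by a constant leaves it unchanged.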



2.2 Formulation of the Optimization Model 

The optimization problem models the basic goal of credit portfolio management. We maximize the expected portfolio return $\mu(x) = \mu'x$ under internal and regulatory loss risk constraints [8], with x the decision vector and $\mu = (\mu_1, \ldots, \mu_n)'$ the vector of the expected returns of the single assets. The internal loss risk is measured by the CVaR deviation of the portfolio loss according to equation (1) and is constrained by the maximal amount of economic capital available, denoted as ec_cap_max. Based on the optimization algorithm of Rockafellar/Uryasev [5], the CVaR constraint is approximated by a set of linear constraints, leading to a linear optimization problem. To implement the algorithm we use, as input data, a sample of market price scenarios $y_1, \ldots, y_K$ of the vector y.² The regulatory credit risk is measured by the regulatory risk-based capital ratios reg_cap = (reg_cap_1, ..., reg_cap_n)' and is limited by the available regulatory core and supplementary capital, denoted by reg_cap_max. The area of the feasible solutions is defined by lower and upper position bounds, the vectors low_bound and up_bound. We solve the following linear optimization model:

Objective function:

$$\mu(x) = \mu'x = \sum_{j=1}^{n} \mu_j x_j$$

Constraint # 1: Internal risk constraint (internal loss risk, i.e. the CVaR deviation estimate, $\leq$ economic capital)

$$\text{(i)}\quad q + \frac{1}{(1-\alpha)K} \sum_{k=1}^{K} z_k \leq \text{ec\_cap\_max},$$

$$\text{(ii)}\quad L(x, y_k) - q \leq z_k, \quad k = 1, \ldots, K,$$

$$\text{(iii)}\quad -z_k \leq 0, \quad k = 1, \ldots, K,$$

$$\text{(iv)}\quad q \in \mathbb{R}.$$

Constraint # 2: Regulatory risk constraint

$$\text{(v)}\quad \text{reg\_cap}'x \leq \text{reg\_cap\_max}.$$

Constraint # 3: Boundaries of the feasible solutions

$$\text{(vi)}\quad \text{low\_bound} \leq x \leq \text{up\_bound}. \qquad (4)$$

¹ In the case of a nonzero probability atom at the α-quantile, CVaR is defined as the weighted average of VaR and the conditional expectation beyond the VaR [6].

² In the application example in the next section, these market price scenarios are generated by a Monte Carlo simulation according to the CreditMetrics approach of J. P. Morgan.



In order to analyze the effects of the regulatory risk constraint on the optimal portfolios, we consider the following optimization models (P) and (P’), without and with the regulatory risk constraint, respectively:

(P): Maximize Objective subject to Constraints # 1 and 3,

(P’): Maximize Objective subject to Constraints # 1, 2, 3. (5)
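The linear constraints (i) to (iv) encode the Rockafellar/Uryasev representation of CVaR: for a fixed portfolio x, the smallest value the left-hand side of (i) can take is the minimum over q of F(q) = q + 1/((1-α)K) Σ_k max(L_k - q, 0), and this minimum is the sample CVaR. The sketch below evaluates F for hypothetical scenario losses; it illustrates the representation only and is not the full linear program, in which x is a decision variable as well.

```python
def cvar_objective(losses, q, alpha):
    """F(q) = q + 1 / ((1 - alpha) K) * sum_k max(L_k - q, 0)."""
    k = len(losses)
    excess = sum(max(loss - q, 0.0) for loss in losses)   # the z_k at their smallest feasible values
    return q + excess / ((1.0 - alpha) * k)

def cvar_by_minimisation(losses, alpha):
    # F is piecewise linear and convex in q, so a minimum is attained at a sample point.
    return min(cvar_objective(losses, q, alpha) for q in losses)

# Hypothetical scenario losses L(x, y_k) for one fixed portfolio x
scenario_losses = [0, -2, 2, 5, -5, 10, -1, 1, -3, -7]
cvar_estimate = cvar_by_minimisation(scenario_losses, alpha=0.8)
print(round(cvar_estimate, 6))   # 7.5
```

Requiring F(q) to stay below ec_cap_max for some q is therefore equivalent to bounding the sample CVaR, which is what keeps the model linear in x, q and z.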



2.3 Risk-Return Analysis of the Portfolio Assets

The contributions of the single assets to the overall portfolio risk and return represent basic information for the risk-return analysis of the optimal portfolios. The return contribution $\mu_j(x)$ of the j-th asset to the portfolio x is given by the j-th coefficient of the return function, i.e. $\mu_j(x) = \mu_j$, j = 1, ..., n.

We apply the Euler allocation principle to derive the risk contributions of the single assets [3,7]: the risk contribution $r_j(x)$ of the j-th asset is defined by the partial derivative of the portfolio risk measure with respect to the j-th asset. It corresponds to the conditional expected loss of the j-th component in the tail of the portfolio loss distribution and can be estimated from the given sample of the market prices as the mean of the losses of the j-th asset in the tail of the loss distribution [7]. We obtain the following risk contribution $r_j(x)$ of the j-th asset, j = 1, ..., n:

$$r_j(x) = \frac{\partial\, \mathrm{CVaR}_\alpha(L(x,y))}{\partial x_j} = E[\,L_j(x,y) \mid L(x,y) \geq \mathrm{VaR}_\alpha(L(x,y))\,], \qquad (6)$$

where $L_j(x,y) = E[y_j]\,x_j - y_j x_j$ is the loss of the j-th position, so that the conditional expectation in (6) equals $E[y_j]\,x_j - E[\,y_j x_j \mid L(x,y) \geq \mathrm{VaR}_\alpha(L(x,y))\,]$, j = 1, ..., n.
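The sample estimate described above, the mean of each asset's losses over the tail scenarios of the portfolio loss, can be sketched as follows. Prices, positions and the confidence level are hypothetical, scenarios are equally likely, and the code assumes a non-empty tail. By construction the estimated contributions add up to the sample CVaR of the portfolio.

```python
import math

def risk_contributions(prices, positions, alpha):
    """Tail means of the position losses L_j = (E[y_j] - y_j) * x_j."""
    k_total = len(prices)
    n_assets = len(positions)
    mean_price = [sum(row[j] for row in prices) / k_total for j in range(n_assets)]
    asset_losses = [[(mean_price[j] - row[j]) * positions[j] for j in range(n_assets)]
                    for row in prices]
    portfolio_losses = [sum(row) for row in asset_losses]
    order = sorted(range(k_total), key=lambda k: portfolio_losses[k])
    n_below = math.ceil(round(alpha * k_total, 9))
    tail = order[n_below:]                      # scenarios beyond the sample VaR
    contributions = [sum(asset_losses[k][j] for k in tail) / len(tail)
                     for j in range(n_assets)]
    cvar = sum(portfolio_losses[k] for k in tail) / len(tail)
    return contributions, cvar

# Hypothetical price scenarios (rows) for two assets (columns)
price_scenarios = [(10, 5), (11, 5), (9, 6), (10, 4), (12, 5),
                   (8, 3), (10, 6), (11, 5), (9, 4), (10, 7)]
contributions, cvar = risk_contributions(price_scenarios, positions=[2, 1], alpha=0.8)
print(contributions, cvar)   # [3.0, 1.5] 4.5
```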

We define the risk-return ratios of single assets, the return on risk adjusted capital 
RORACj(x) of the j-th asset and the return on equity RoEj(x), i.e. the return on the 
regulatory capital, of the j-th asset as 






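$$\text{(i)}\quad \mathrm{RORAC}_j(x) = \frac{\mu_j(x)}{r_j(x)}, \qquad \text{(ii)}\quad \mathrm{RoE}_j(x) = \frac{\mu_j(x)}{\text{reg\_cap}_j}, \qquad j = 1, \ldots, n. \qquad (7)$$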



3 Application Example 



The credit portfolio of an example bank, ABC Bank, consists of three typical credit assets: asset 1 represents high-quality bonds (rating AA), asset 2 mortgage loans (rating BB) and asset 3 retail loans (rating B). 10 units of regulatory capital are available, of which 94% are actually in use. The internal risk (CVaR) level may be varied to some extent according to the risk policy of the bank. The initial portfolio uses 48 units of economic capital. Our goal is to investigate how the risk-return relations of the initial credit portfolio can be improved and how the regulatory risk constraint affects the optimal portfolios. We applied the optimization models (P) and (P’) with different CVaR levels. First, we generated the efficient frontiers and analyzed the overall portfolio risk-return relations. Next, we analyzed the risk-return structures of the single assets of the optimal portfolios.

As shown in Fig. 1, the regulatory constraint becomes active at a CVaR level of 39.9 units. At the given capital levels (ec_cap_max=48, reg_cap_max=10), the expected portfolio returns can be improved by 0.07 units in (P’) and by 0.23 units in (P). This means that without the regulatory constraint an additional profit of 0.16 units could be gained. The portfolio RORAC, defined as the expected return $\mu(x)$ divided by the CVaR deviation $\mathrm{CVaR}_\alpha(L(x,y))$ of the portfolio x, increases from 6.09% to 6.29% in (P’) and to 6.63% in (P).

We also observe that the ABC bank can generate higher portfolio RORACs by lowering the level of internal risk. The maximal RORAC of 6.82% is reached on the interval ec_cap_max ∈ [34.9, 37.7], where the regulatory constraint is not active. However, implementing a RORAC-optimizing strategy would require reducing the credit volumes and absolute returns. This might conflict with other corporate goals and may not be supported by the shareholders.



[Figure: expected returns (left axis) and portfolio RORAC (right axis) plotted against portfolio CVaR (horizontal axis, 33 to 63 units), showing the efficient lines and the portfolio RORACs of (P) and (P’); the point at which Constraint #2 becomes active in (P’) is marked.]

Fig. 1. Efficient Lines and Portfolio RORACs of the Optimization Problems (P) and (P’)









In order to analyze the risk-return structure of the optimal portfolios, we first examine the positions of the single assets, which are shown in Fig. 2. The narrow and broad lines represent the positions of the single assets in the solutions of (P) and (P’), respectively. Starting from the minimal CVaR portfolio, the assets are increased in the optimal solutions in the order of descending RORACs, as defined in eq. (7)(i).³ When the regulatory constraint becomes active, we observe the effect of capital arbitrage: assets with higher RoEs are preferred to assets with higher RORACs, and the overall portfolio level of risk is increased.




Fig. 2. Impact of the Regulatory Constraint on the Optimal Portfolio Structures 



In order to analyze the effect of capital arbitrage more closely, we examine the risk-return structure of the optimal portfolio at the initial CVaR level of 48 units, as described in Fig. 3. Without the regulatory risk constraint, the position of asset 1, with the highest RORAC, is increased by 50% and that of asset 2 by 28.3%, while the position of asset 3, with the lowest RORAC, is reduced by 21.5%. In (P’), asset 1, showing the lowest RoE, is increased less than in (P). The position of asset 3, with the highest RoE, is increased, while the position of asset 2, with higher RORAC but lower RoE than asset 3, is reduced. The riskier assets are weighted higher in (P’), resulting in lower returns at the given CVaR level and a suboptimal use of the economic capital, as can be observed in Fig. 1 above.




³ Although the RORACs of the single assets of the optimum portfolios x* differ slightly along the efficient line, their ranking remains constant, i.e. RORAC_3(x*) < RORAC_2(x*) < RORAC_1(x*).



















4 Conclusion 

We have introduced an algorithm that maximizes the expected returns of a credit portfolio subject to internal and regulatory risk constraints. It is based on the new risk measure CVaR, which is appropriate for credit portfolio risk measurement, and the resulting problem can be solved by linear programming methods. The optimization model makes it possible to identify intervals of efficient use of both capital resources, the available economic and regulatory capital, and of highest portfolio RORACs. It identifies "unrealized" profits due to the regulatory risk constraint. We conducted risk-return analyses of the single assets of the optimal portfolios and found evidence of capital arbitrage, which leads to suboptimal portfolios under the regulatory risk limitation rule, as assets of higher RoE but higher risk are weighted higher than assets of lower risk and higher RORACs.

In a follow-up study we will pursue the application example for the Basle II Accord. We will analyze the impact of new risk weights on optimal credit portfolios. Further, we will investigate how the internal and regulatory risk-return structures of the single assets influence the optimal solutions when both capital constraints are active. Also, the intuitive statement that the single assets are increased in the optimum solutions in the order of descending RORACs in (P) remains to be investigated formally, as the risk contributions are not explicitly modeled in the optimization algorithm. Another point of interest is to develop an optimization algorithm that calculates the RORAC-optimal portfolios of (P) and (P’) in one optimization run.



References 

[1] Acerbi, C. and Tasche, D. (2001): On the coherence of expected shortfall, Working paper, can be downloaded from http://www.gloriamundi.org.

[2] Artzner, Ph., Delbaen, F., Eber, J.-M., Heath, D. (1999): Coherent Measures of Risk, Mathematical Finance, Vol. 9, No. 3, pp. 203-228.

[3] Patrik, G., Bernegger, S., Rüegg, M.B. (1999): The use of risk adjusted capital to support business decision making, in: Casualty Actuarial Society (ed.), Casualty Actuarial Society Forum, Spring 1999 Edition, Baltimore.

[4] Pflug, G.Ch. (2000): Some Remarks on the Value-at-Risk and the Conditional Value-at-Risk, in: Uryasev, S. (Ed.), Probabilistic Constrained Optimization: Methodology and Applications, Kluwer Academic Publishers, pp. 272-281.

[5] Rockafellar, R.T. and Uryasev, S. (2000): Optimization of Conditional Value-at-Risk, The Journal of Risk, Vol. 2, No. 4, pp. 21-51.

[6] Rockafellar, R.T. and Uryasev, S. (2002): Conditional Value-at-Risk for General Loss Distributions, Journal of Banking and Finance, Till.

[7] Tasche, D. (1999): Risk Contributions and Performance Measurement, Working Paper, Technische Universitaet Muenchen.

[8] Theiler, U. (2002): Optimization Approach for the Risk-Return-Management of the Bank Portfolio, Wiesbaden (in German).





Methods for Risk Capital Allocation in the
Proprietary Trading of Banks



Mario Straßberger

Lehrstuhl für Betriebswirtschaftslehre, insbes. Finanzwirtschaft und Finanzdienstleistungen, Technische Universität Dresden, Mommsenstraße 13, D-01062 Dresden, E-Mail: strassberger@finance.wiwi.tu-dresden.de



1 Introduction

Value-at-Risk (VaR), as a measure for quantifying market price risks, offers new possibilities for managing return and risk, especially in the proprietary trading business of a bank. The scientific discussion so far has concentrated mainly on models for estimating the VaR. The next step is the construction of a VaR limit system that builds on the VaR model. This is on the one hand a supervisory requirement and on the other hand an economic consequence.

A VaR limit system is a multi-level, hierarchically structured system of VaR limits that serves to limit simultaneously the market price risks of all portfolios and aggregated portfolios of proprietary trading. It must be designed in such a way that it produces an optimal allocation of risk capital at the same time. Risk capital denotes that capital reserve of the bank which suffices, with high probability, to cover unexpected losses from proprietary trading. It is a scarce resource.

In this contribution, the well-known analytical delta-normal model for VaR estimation is applied to new questions of market price risk management. The focus is on the efficient allocation of risk capital and on the construction of a consistent VaR limit system for the hierarchical portfolio structure of the proprietary trading of banks. For this purpose, the VaR model is linked with an optimization approach that maximizes the risk-adjusted profitability of proprietary trading.



2 Delta-Normal Model for Value-at-Risk Estimation

The future potential loss $L_{t+H} := W_t - W_{t+H}$ of a portfolio is defined as the negative change in its market value. The VaR, as the loss bound forecast for a given probability p and holding period H, is defined by the p-quantile of the distribution function F of $L_{t+H}$ [3, 4, 6]:

$$\mathrm{VaR}_{t+H}(p) = F^{-1}(p). \qquad (1)$$

Let $R_t$ denote the vector of risk factors relevant to the market value and $Y_{t+H}$ the vector of continuous returns $Y_{i,t+H} = \ln R_{i,t+H} - \ln R_{i,t}$. The portfolio loss is then a function $L_{t+H} = L(Y_{t+H})$. The delta-normal model works with two central approximations. First, the $Y_{t+H}$ are assumed to be independent and identically jointly normally distributed with mean vector $\mu_Y$ and covariance matrix $\Sigma_Y$. Second, $L(Y_{t+H})$ is approximated as a linear function. With $R_t = \operatorname{diag}(R_{i,t})$ and $\delta_t$ with $\delta_{i,t} = \partial W_t / \partial R_{i,t}$, $i = 1, \ldots, M$, the loss function is

$$L(Y_{t+H}) = -\delta_t' R_t Y_{t+H}. \qquad (2)$$

It follows from the approximations that the portfolio loss is normally distributed with mean $\mu_L = -\delta_t' R_t \mu_Y$ and variance $\sigma_L^2 = \delta_t' R_t \Sigma_Y R_t' \delta_t$. The resulting VaR estimate is

$$\mathrm{VaR}_{t+H}(p) = \mu_L + \sigma_L z_p. \qquad (3)$$
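Estimate (3) can be sketched in a few lines of plain Python. The sensitivities, factor levels and covariance matrix below are hypothetical; the standard normal quantile $z_p$ is taken from the standard library.

```python
import math
from statistics import NormalDist

def delta_normal_var(delta, levels, mean_returns, cov_returns, p):
    """VaR(p) = mu_L + sigma_L * z_p for the linear loss L = -delta' R Y."""
    exposure = [d * r for d, r in zip(delta, levels)]            # delta' R
    mu_loss = -sum(e * m for e, m in zip(exposure, mean_returns))
    var_loss = sum(exposure[i] * cov_returns[i][j] * exposure[j]
                   for i in range(len(exposure)) for j in range(len(exposure)))
    z_p = NormalDist().inv_cdf(p)                                # p-quantile of N(0, 1)
    return mu_loss + math.sqrt(var_loss) * z_p

# Hypothetical inputs: two risk factors, uncorrelated returns with 1% volatility
var_99 = delta_normal_var(delta=[3.0, 4.0], levels=[100.0, 100.0],
                          mean_returns=[0.0, 0.0],
                          cov_returns=[[1e-4, 0.0], [0.0, 1e-4]], p=0.99)
print(round(var_99, 2))   # about 11.63, i.e. 5 * z_0.99
```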

$z_p$ is the p-quantile of the standard normal distribution. Using the correlation matrix $P_L = (\rho_{ij})$, $i, j = 1, \ldots, K$, of the portfolio losses, the VaR estimates of K portfolios, collected in the vector $\mathbf{VaR}_{t+H}$, can be aggregated [1, 4]:

$$\mathrm{VaR}_{t+H} = \sqrt{\mathbf{VaR}_{t+H}'\, P_L\, \mathbf{VaR}_{t+H}}. \qquad (4)$$
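The effect of the correlation matrix in aggregation rule (4) is easiest to see for two portfolios. The following sketch uses hypothetical VaR estimates of 3 and 4 and varies the correlation of the losses:

```python
import math

def aggregate_var(var_vector, correlation):
    """Aggregated VaR = sqrt(VaR' P VaR) for a correlation matrix P of the portfolio losses."""
    n = len(var_vector)
    quadratic_form = sum(var_vector[i] * correlation[i][j] * var_vector[j]
                         for i in range(n) for j in range(n))
    return math.sqrt(quadratic_form)

var_vector = [3.0, 4.0]   # hypothetical VaR estimates of two portfolios
aggregated = {}
for rho in (-1.0, 0.0, 1.0):
    corr = [[1.0, rho], [rho, 1.0]]
    aggregated[rho] = aggregate_var(var_vector, corr)
print(aggregated)   # {-1.0: 1.0, 0.0: 5.0, 1.0: 7.0}
```

Only for perfectly correlated losses do the individual VaR figures simply add up; any lower correlation yields a smaller aggregated VaR.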



3 Risk Capital Allocation

Questions of risk capital allocation and of the construction of a hierarchical VaR limit system have so far been treated [5, 7, 8, 10] without proposing concrete solution algorithms. In the following, we consider a portfolio hierarchy as it is conceivable in the proprietary trading of a bank and as shown in simplified form in Fig. 1. Within this portfolio hierarchy with a total of K portfolios, J < K base portfolios are considered, which are aggregated via K - J - 1 intermediate portfolios into the total portfolio.

For proprietary trading, a certain amount of risk capital is taken as given for a certain time interval. Intertemporal problems of distributing risk capital are not considered. The risk capital provided is interpreted as the total VaR limit. Risk capital is allocated by means of VaR limits. The aim is to construct a VaR limit system that satisfies the following criteria [2, 9]. The aggregated VaR limit of proprietary trading must not exceed the total VaR limit at any time. Compliance with the VaR limits in the decentralized base portfolios must ensure compliance with the VaR limit on all levels of aggregation. The VaR limits of the base portfolios should be chosen as high as possible. They should hold unambiguously and independently of the VaR realizations of the other base portfolios.

Fig. 1. Example of a portfolio and VaR limit hierarchy

To solve this task, the delta-normal model framework is modified and extended as follows [9]. Across all base portfolios, N financial instruments are traded. Let $W(R_t)$ denote the vector of the market values of all financial instruments. To represent the structure of base portfolio $i = 1, \ldots, J$ with respect to the N financial instruments, the structure vector $\theta_i$ is introduced. The market value of base portfolio i is then

$$W_i(R_t) = \theta_i' W(R_t). \qquad (5)$$

For the structure vector, quantitative restrictions of the investment universe $u_i \leq \theta_i \leq o_i$, with lower bounds $u_i$ and upper bounds $o_i$, can additionally be introduced. This makes it possible to integrate volume limits into the model as well. To capture the overall structure, the matrix $\Theta_J = (\theta_1, \ldots, \theta_J)$ is first formed from the structure vectors of the base portfolios. To represent the portfolio hierarchy, the matrix T is introduced, which consists of the elements 0 and 1. The financial instrument structure of the entire portfolio hierarchy is then

$$\Theta = \Theta_J T = (\theta_1, \ldots, \theta_J, \ldots, \theta_K). \qquad (6)$$
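How the 0/1 matrix T turns the base portfolio structures into the structure of the whole hierarchy can be illustrated with a small, purely hypothetical example of two base portfolios, three instruments and one total portfolio:

```python
def mat_mul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Columns of theta_base are the structure vectors of J = 2 base portfolios
# over N = 3 instruments (hypothetical holdings).
theta_base = [[10, 0],
              [5, 2],
              [0, 8]]
# T has one column per portfolio in the hierarchy: base 1, base 2, total.
t_matrix = [[1, 0, 1],
            [0, 1, 1]]
theta = mat_mul(theta_base, t_matrix)
print(theta)   # [[10, 0, 10], [5, 2, 7], [0, 8, 8]]
```

The last column is the sum of the two base columns, i.e. the structure of the aggregated portfolio.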

The structure of the next higher aggregation level results in each case as the sum of the structures of the portfolios below it; $\theta_K$ represents the structure of the aggregated trading portfolio. Along the portfolio hierarchy captured in this way, a system of VaR limits $VL_i(p)$, $i = 1, \ldots, K$, is to be constructed that satisfies the criteria formulated above. In particular,

$$\mathrm{VaR}(\theta_i, p) \leq VL_i(p), \quad \forall\, i = 1, \ldots, K, \qquad (7)$$

must hold.








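From the given covariance matrix $\Sigma$ of the financial instrument returns, with $W = \mathrm{diag}(w(R_t))$, the covariance matrix of the loss variables of the entire portfolio hierarchy can be derived via the structure matrix $\Theta$ as

$$\Sigma_L = \Theta^\top W \, \Sigma \, W \, \Theta. \tag{8}$$

With the additional approximation of a zero expected value, $\mu = 0$, this permits the estimation of the VaR vector for the portfolio hierarchy as

$$\mathrm{VaR}(\Theta, p) = z_p \sqrt{\mathrm{diag}(\Sigma_L)}, \tag{9}$$

where $z_p$ denotes the quantile of the standard normal distribution belonging to the confidence level $p$.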

The VaR vector contains the VaR estimates of all base portfolios and aggregated portfolios. Its last element is the aggregated $\mathrm{VaR}(\theta_K, p)$ of the trading portfolio. If $\mathrm{VaR}(\theta_K, p) = VL_K(p)$ is specified exogenously at the level of the risk capital to be allocated, a set of VaR vectors whose last element equals $VL_K(p)$ in each case can be derived by proceeding recursively and varying the base portfolio structures. This set is called the set of admissible iso-VaR-limit combinations.

It is important for the VaR limit system that the $\mathrm{VaR}(\theta_i, p)$ of the base portfolios (and hence of all aggregated portfolios) are parametrized by the structure vectors $\theta_i$. The VaR depends on the decisions about the structure of the base portfolios, which are taken independently within the defined investment universe. This has far-reaching consequences for the VaR limit system. If the requirement of Eq. (7) is not to be violated, the construction of the VaR limit system must start from those base portfolio structures that lead to the maximum possible correlations of the portfolio losses.

The following simple example serves as an illustration. Consider $J = 2$ base portfolios in which $N = M = 2$ financial instruments are traded separately and which are aggregated into one overall portfolio. Then

$$\theta_1 = (\theta_1, 0)^\top, \quad \theta_2 = (0, \theta_2)^\top, \quad T = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}. \tag{10}$$

Using the procedure described, the VaR vector can be derived as

[...]

[...] ellipse shown. Here $\rho = -0.5$ and an aggregate limit of $VL_3 = 10.0$ were assumed. Since the VaR is defined as always positive, only VaR limit combinations from the first quadrant are admissible. In this special case these correspond to two long positions. For one long and one short position each, however, the example also shows in the second and fourth quadrants that the VaR limits of the base portfolios turn out lower the higher the correlation of the portfolio losses. It thus becomes immediately clear that, when the decisions on the structure of the base portfolios are taken independently, VaR limits can only be granted at the level of the combinations lying in these quadrants. Otherwise there is a risk that the requirement of Eq. (7) is violated.

Fig. 2. Two-dimensional VaR limit allocation problem

To select one VaR limit combination from the set of admissible VaR limit combinations, the introduction of relations $v_{ij} = VL_i / VL_j$ between the VaR limits of the base portfolios has been proposed [9]. The relations can be determined, for example, by earnings or profitability ratios. In Fig. 2, $v_{12} = VL_1 / VL_2 = 2$ was assumed. The solution of the allocation problem is then given by the intersection of the relation line with the ellipse of iso-VaR-limit combinations.
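The intersection of the relation line with the iso-VaR-limit ellipse can be sketched numerically. The sketch below is illustrative only: it assumes that, for two base portfolios, the ellipse is $VL_1^2 + 2\rho\, VL_1 VL_2 + VL_2^2 = VL_K^2$ (the binding form of the constraint in Eq. (14)), and it reads the value 10.0 from the example as the aggregate limit. The function name `allocate_by_relation` is hypothetical.

```python
from math import sqrt


def allocate_by_relation(v12, rho, vl_total):
    """Intersect the line VL1 = v12 * VL2 with the assumed iso-limit ellipse.

    Assumed ellipse: VL1^2 + 2*rho*VL1*VL2 + VL2^2 = vl_total^2.
    Returns the pair (VL1, VL2) in the first quadrant.
    """
    # Substituting VL1 = v12 * VL2 leaves VL2^2 * scale = vl_total^2.
    scale = v12 * v12 + 2.0 * rho * v12 + 1.0
    if scale <= 0.0:
        raise ValueError("relation line does not intersect the ellipse")
    vl2 = vl_total / sqrt(scale)
    return v12 * vl2, vl2


# Values quoted in the example: v12 = 2, rho = -0.5, aggregate limit 10.0.
vl1, vl2 = allocate_by_relation(2.0, -0.5, 10.0)
print(vl1, vl2)
```

With a negative correlation, the limit of a single base portfolio obtained this way can exceed the aggregate limit, which is the diversification effect the ellipse expresses.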

Instead of introducing VaR limit relations, an optimization approach is set up below to select a VaR limit combination. The aim of the allocation process should be to put the scarce resource risk capital to its economically most efficient use. The so-called Risk Adjusted Return on Risk Adjusted Capital (RARoRAC) is used as the optimality criterion [7]:

$$\mathrm{RARoRAC}_i = G_i(VL_i) / VL_i. \tag{12}$$

Here $G_i(VL_i) = G_i - r_i VL_i$ denotes the risk-adjusted net profit of the portfolio, which is to be understood as the difference between net profit and the cost of risk capital. The RARoRAC not only establishes a direct link to the allocated VaR limit but also uses a risk-adjusted result. This appears suitable in particular in proprietary trading, where the target returns $r_i$ on risk capital differ. The optimal allocation of risk capital is reached when the aggregated RARoRAC of proprietary trading is maximal for the given risk capital. This leads to the following optimization problem [1]:

$$\mathrm{RARoRAC}_K = \frac{\sum_{i=1}^{J} G_i(VL_i)}{\sqrt{VL^\top P \, VL}} \to \max! \tag{13}$$

The constraint is

$$\sqrt{VL^\top P \, VL} \le VL_K. \tag{14}$$








Using the Lagrange approach, the Lagrangian is

$$L = \frac{\sum_{i=1}^{J} G_i(VL_i)}{\sqrt{VL^\top P \, VL}} + \lambda \left( VL_K - \sqrt{VL^\top P \, VL} \right), \tag{15}$$

where $\lambda$ denotes the Lagrange multiplier. The real solution of the system of equations determined by

$$\partial L / \partial \lambda = 0, \quad \partial L / \partial VL_i = 0, \quad i = 1, \dots, J \tag{16}$$

leads uniquely to a global maximum. Known numerical optimization methods can be used to find the solution.
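As a rough numerical illustration of the optimization in Eqs. (13) and (14), the following sketch runs a plain grid search for two base portfolios instead of solving the Lagrange system. Everything specific in it is an assumption made for illustration: the S-shaped profit function `net_profit`, its parameters, the correlation of -0.5 and the aggregate limit of 10.0 are hypothetical and not taken from the paper.

```python
from math import sqrt

RHO = -0.5        # assumed correlation of the two portfolio losses
VL_TOTAL = 10.0   # assumed aggregate VaR limit


def aggregate_limit(vl1, vl2, rho=RHO):
    """sqrt(VL' P VL) for two base portfolios with loss correlation rho."""
    return sqrt(vl1 * vl1 + 2.0 * rho * vl1 * vl2 + vl2 * vl2)


def net_profit(vl, g_max, half, r):
    """Hypothetical S-shaped profit minus the cost of risk capital r * vl."""
    return g_max * vl * vl / (half * half + vl * vl) - r * vl


def rarorac(vl1, vl2):
    """Aggregated RARoRAC: summed risk-adjusted profits over aggregate limit."""
    total = net_profit(vl1, 6.0, 4.0, 0.05) + net_profit(vl2, 3.0, 3.0, 0.08)
    return total / aggregate_limit(vl1, vl2)


def grid_search(step=0.1, upper=15.0):
    """Best feasible (value, VL1, VL2) on a regular grid of limits."""
    best = None
    n = int(round(upper / step))
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            vl1, vl2 = i * step, j * step
            if aggregate_limit(vl1, vl2) > VL_TOTAL:
                continue  # violates the constraint of Eq. (14)
            value = rarorac(vl1, vl2)
            if best is None or value > best[0]:
                best = (value, vl1, vl2)
    return best


print(grid_search())
```

A grid search is only a stand-in here; for more portfolios a gradient-based solver applied to the same objective and constraint would be the natural choice.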

Decisive for the solution of the optimization problem are, first, the formulation of the function $G_i(VL_i)$ and, second, the correlation matrix $P$. Polynomials of order higher than two are conceivable for modelling the risk-adjusted net profits. S-shaped functions with initially increasing and then decreasing marginal profit can be economically plausible. The correlation matrix of the portfolio losses is unstable over time. Moreover, it is unknown ex ante. The higher the correlations that may occur, however, the lower the VaR limits that may be granted. This view explains the well-known problem that the overall VaR limit at the aggregated level is often not fully utilized.



References

1. Burmester C, Hille CT, Deutsch HP (1999) Risikoadjustierte Kapitalallokation, Beurteilung von Allokationsstrategien über einen Optimierungsansatz. In: Eller R, Gruber W, Reif M (eds) Handbuch Bankenaufsicht und interne Risikomodelle. Stuttgart, pp 389-417

2. Delbaen F, Denault M (2000) Coherent allocation of risk capital. E. T. H. Zurich, 
Ecole des H. E. C. Montreal 

3. Duffie D, Pan J (1997) An Overview of Value at Risk. The Journal of Derivatives 4: 
7-49 

4. Jorion P (2000) Value-at-Risk, The New Benchmark for Managing Financial Risk. 
New York 

5. Kupiec PH (1999) Risk Capital and VaR. The Journal of Derivatives 6: 41-52 

6. Linsmeier TJ, Pearson ND (2000) Value at Risk. Financial Analysts Journal 56: 47-67 

7. Matten C (1996) Managing Bank Capital, Capital Allocation and Performance Meas- 
urement. Chichester 

8. Merton RC, Perold AF (1993) Theory of risk capital in financial firms. Journal of Applied Corporate Finance 6: 16-32

9. Ridder T (1999) Konsistente VaR-Limitsysteme. SGZ-Bank, Frankfurt a. M. 

10. Saita F (1999) Allocation of Risk Capital in Financial Institutions. Financial Manage- 
ment 28: 95-111 





Process Optimization via Conventional Factorial 
Designs and Simulated Annealing on the Path of 
Steepest Ascent for a CSTR 



Pongchanun Luangpaiboon^ 

Department of Industrial Engineering, Faculty of Engineering, Thammasat Uni- 
versity (Rangsit Campus), KlongLuang, Pathumthani, 12121, THAILAND 



Abstract This work determines the efficiency of sequential algorithms for automatic optimization of a chemical process. The method of steepest ascent and an integrated approach combining the method of steepest ascent with Simulated Annealing are compared on a simulated continuous stirred tank reactor (CSTR) with various levels of signal noise. The results suggest that the method of steepest ascent is the more efficient on the CSTR surface at the lower levels of noise. However, the integrated approach with the Simulated Annealing element works well when the standard deviation of the noise is at higher levels. Although the integrated algorithm gives a better average and standard deviation of the greatest actual concentration of the product and a better percentage of sequences ending at the optimum, it needs more runs, on average, to converge to the optimum.



1 Introduction

The objective of Response Surface Methodology is to describe how the response 
of a process varies with changes in k predictor variables (Myers and Montgomery 
1995). The predictor variables determined will depend on the specific field of the 
application. Most industrial processes have some predictor variables. These pre- 
dictor variables can be adjusted by plant operators or by automatic control mecha- 
nisms to enhance the efficiency of the machine. 

There is much current interest in optimization methods with a stochastic element, such as Genetic Algorithms (GA) and Simulated Annealing (SA). The genetic algorithm was introduced for finding the global maximum on a hypersurface (Jennison et al. 1995). It is a set of rules for searching large solution spaces in a manner similar to natural selection in biological evolution (Holland 1975; Goldberg 1989). Simulated Annealing rests on an analogy between problems in statistical mechanics and optimization. Its properties expose useful information and help it cope with large and noisy systems. A recent study (Luangpaiboon et al. 2000) compared a modified simplex method and a genetic algorithm for a variety of response surfaces and levels of measurement noise. The GA appears to work well in the area of Response Surface Methodology (RSM). However, the high variability of the GA when applied to on-line optimization could be a serious disadvantage (Luangpaiboon 2000).

^ lpongch@engr.tu.ac.th

The objective of this study is to compare the efficiency of sequential algorithms for on-line optimization of a chemical process in the presence of noise. The method of steepest ascent and the integrated approach combining the method of steepest ascent with Simulated Annealing are selected and implemented on the CSTR. The context is maximizing the concentration of a desired product of a chemical reactor with respect to feed rate, concentration and temperature.



2 Related Methods 



2.1 Method of Steepest Ascent 

In the method of steepest ascent, a hyperplane is fitted to the results from an initial 2^k design. The direction of steepest ascent on the hyperplane is then determined by using principles of least squares and experimental designs. The next run is carried out at a point that is some fixed distance in this direction, and further runs are carried out by continuing in this direction until no further increase in yield is noted. When the response first decreases, another 2^k design is carried out, centered on the preceding design point. A new direction of steepest ascent is estimated from this latest experiment. Provided at least one of the coefficients of the hyperplane is statistically significantly different from zero, the search continues in this direction. Once the first order model is determined to be inadequate, the area of the optimum is identified via a finishing strategy (Luangpaiboon 2001).
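The first-order fit and the resulting direction can be sketched as follows. This is a minimal illustration, assuming a full 2^3 factorial in coded units (-1/+1), for which the least-squares slopes reduce to simple contrasts because the design columns are orthogonal. The function name `steepest_ascent_direction` and the synthetic, noise-free yields are hypothetical and not taken from the paper.

```python
from itertools import product
from math import sqrt


def steepest_ascent_direction(design, yields):
    """Slopes of a fitted plane and the unit vector of steepest ascent.

    design: runs of a full two-level factorial in coded units (-1 or +1).
    yields: observed response for each run, in the same order.
    """
    n = len(design)
    k = len(design[0])
    # Orthogonal +/-1 columns: least squares gives b_j = sum(x_ij * y_i) / n.
    slopes = [sum(run[j] * y for run, y in zip(design, yields)) / n
              for j in range(k)]
    norm = sqrt(sum(b * b for b in slopes))
    if norm == 0.0:
        raise ValueError("fitted plane is flat: no direction of ascent")
    return slopes, [b / norm for b in slopes]


# Synthetic example: 2^3 design around a centre point at the coded origin.
design = list(product((-1, 1), repeat=3))
yields = [40 + 3 * x1 + 0 * x2 + 4 * x3 for x1, x2, x3 in design]

slopes, direction = steepest_ascent_direction(design, yields)
step_length = 1.0  # step length in coded units
next_point = [step_length * d for d in direction]
print(slopes, direction, next_point)
```

In practice the slopes would also be tested for significance before moving, as the procedure above requires.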



2.2 Simulated Annealing 

This algorithm is a set of rules for searching large solution spaces in a manner that mimics the annealing process of metals (Kirkpatrick et al. 1983). The algorithm simulates the behavior of an ensemble of atoms in equilibrium at a given finite temperature (Bohachevsky et al. 1986). In the case of maximization, the algorithm starts from an initial solution and the corresponding value of the objective function. A new solution is then generated and its objective value determined. The new solution is accepted unconditionally if its objective value is improved, and the process continues. However, the stochastic element occasionally allows the algorithm to accept a new solution that deteriorates rather than improves the objective function value. Simulated Annealing also includes a number of parameters that are claimed to affect the efficiency of the algorithm (Luangpaiboon 1995).








3 Continuous Stirred Tank Reactor (CSTR) 

For the CSTR, a stream rich in chemical A of feed concentration CA(in) is flowing into a reactor at a feed flow rate of F(in) and a feed temperature of T(in). The reaction in the CSTR is an irreversible, first order exothermic reaction. A proportion of chemical A is converted to a desired product B, which, in turn, at high temperature undergoes further reaction and is decomposed to form an undesired by-product C (Fig. 3.1). The stated objective is to explore the operating conditions corresponding to a higher concentration of product.




Fig. 3.1. The continuous stirred tank reactor 

It is also assumed that the level is perfectly controlled, so the volume of material in the tank is constant. This implies that the flow out equals the flow in. The temperature in the reactor may be regulated by manipulating the flow rate of the cooling water (Fc) in the heat exchanger. There are three predictor variables, which can be set to any chosen values within safe limits. These predictor variables, which relate to the feed flow, are shown in Table 3.1. The response variable of the process is defined to be the concentration of the desired product B, CB.



Table 3.1. Predictor variables of feed flow and their safe limits

Predictor variable   Description                      Unit           Feasible region
T(in)                Feed temperature of reactant A   Celsius        60-100
F(in)                Feed flow rate of reactant A     Liter/minute   1-10
CA(in)               Concentration of reactant A      Mole/liter     1-15







4 Details of the Proposed Method 

Parameters: the volume of the factorial design [8]; the step length [1]; the significance level for tests of significance of slopes [10%]; g [-1]; β [6.5].

Step 1: Perform a 2^3 design at a random centre point.

Step 2: Fit a regression plane to the data.

Step 3: Test whether there is evidence that either β1, β2 or β3 is different from zero at the 10% level of significance.

Step 4a: If the result is significant, move one step along the path of steepest ascent, determine the yield and go to Step 5a. Otherwise go to Step 4b.

Step 4b: Test whether there is evidence that the interaction or curvature check is significant. If the check is significant, go to Step 6. Otherwise, replicate the design and return to Step 2.

Step 5a: If the yield is greater than the previous yield or the stochastic element meets the requirement of acceptance, continue by moving another step in the same direction.

Step 5b: If the yield is not greater than the previous one (y0), then calculate the objective increment (Δy) and test the stochastic element as follows: generate a random variable x ~ Uniform(0, 1). If x < P(Δy) = exp(β y0^g Δy) then go to Step 5a. Otherwise return to the preceding point, carry out another 2^3 design and return to Step 2. If the first step leads to a yield less than the yields obtained in the preceding 2^3 design, then replicate the design and go to Step 2.

Step 6: Implement the finishing strategy. This is a central composite design (CCD) centred on the point (T(in)p, F(in)p, CA(in)p), to which a quadratic surface is then fitted to find the maximum (T(in)p, F(in)p, CA(in)p). If (T(in)p, F(in)p, CA(in)p) is within the volume of the designs, then (T(in)p, F(in)p, CA(in)p) is taken as the optimum operating condition. If (T(in)p, F(in)p, CA(in)p) is not within this volume, another CCD is carried out, centered on the point from the first CCD with the greatest yield. A quadratic surface is now fitted to all the data. If the maximum is outside the volume of the union of the two containing cubes, the ridge is searched for the greatest value of the function, using a step length of 0.05 (from additional experiments by using fewer runs).
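The stochastic acceptance test of Step 5b can be sketched as below. This is an illustrative reading of the rule, assuming the acceptance probability is P(Δy) = exp(β y0^g Δy) with the parameter values g = -1 and β = 6.5 listed above, a strictly positive previous yield y0, and Δy defined as the new yield minus y0. The function names are hypothetical.

```python
import math
import random


def acceptance_probability(y_prev, y_new, beta=6.5, g=-1.0):
    """Probability of continuing along the path after moving to y_new.

    Improvements are always accepted. A deterioration (delta <= 0) is
    accepted with probability exp(beta * y_prev**g * delta); y_prev must be
    strictly positive, as a concentration is.
    """
    delta = y_new - y_prev
    if delta > 0:
        return 1.0
    return math.exp(beta * (y_prev ** g) * delta)


def accept_step(y_prev, y_new, rng):
    """Draw x ~ Uniform(0, 1) and accept the step if x < P(delta)."""
    return rng.random() < acceptance_probability(y_prev, y_new)


rng = random.Random(0)  # seeded only to make the illustration repeatable
print(acceptance_probability(50.0, 45.0))
print(accept_step(50.0, 55.0, rng))
```

With g = -1 the deterioration is scaled by the previous yield, so the same absolute drop is accepted more readily when the current yield is already high.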



5 Experimental Results and Discussion 

The comparison is made with measurement noise on the concentration of the desired product B (normal and independent with zero mean and standard deviation of 0.5, 1, 2 and 3). Typical three-dimensional response surfaces, with CA(in) fixed at 1 and 15, are shown in Fig. 5.1. Four performance measures over 100 realizations are used in this study. The first and second measures are the average and the standard deviation of the greatest actual concentration of the desired product B from the finishing strategy, respectively. The third is the average number of runs until the algorithms converge. The fourth is the percentage of sequences that ended at the optimum.








Fig. 5.1. The surface plot with CA(in) fixed at 1 and 15 respectively 

The process settings for all the scenarios are given in Table 5.1. The performance of the method of steepest ascent and the integrated approach is illustrated by the box plots in Fig. 5.2 for error standard deviations of 2.0 and 3.0. Note that since the efficiency of these algorithms is related to their initial points, random starting points were set for all algorithms. These results show that the performance of the integrated approach with the stochastic element of Simulated Annealing seems superior to the algorithm based on the method of steepest ascent at the higher levels of error standard deviation.



Table 5.1. Four achievements over 100 realisations

Stdev of   Algorithm         Average greatest       Stdev. of greatest     Average number   Percentage
noise                        actual concentration   actual concentration   of runs          (ending at optimum)
0.5        Steepest Ascent   59.2011                7.6823                 33.3             0.85
           Integrated        56.7053                10.2178                38.17            0.85
1          Steepest Ascent   57.4528                7.3680                 33.75            0.90
           Integrated        57.0683                6.8899                 34.9             0.90
2          Steepest Ascent   59.3067                9.5568                 32.6             0.85
           Integrated        61.1069                7.7166                 34.75            0.90
3          Steepest Ascent   60.0803                7.7723                 31.35            0.80
           Integrated        61.2676                6.3431                 33.3             0.95






Fig. 5.2. Two independent box plot comparisons showing the performance of the method of steepest ascent and the integrated approach when the error standard deviation was 2.0 and 3.0 respectively.

Moreover, the percentage of sequences that ended at the optimum, or within a radius of two of it, is better for the integrated approach at higher levels of error standard deviation, although a greater number of runs was required to converge to the optimum. As stated earlier, the function in this research was restricted to three predictor variables. Consequently, comparisons and conclusions between the two algorithms may not be valid for other families of functions. Other stochastic approaches could be added to the method based on conventional factorial designs to increase its performance, especially in terms of speed of convergence, when the error standard deviation is at higher levels.



6 Acknowledgements 

The author wishes to thank the Faculty of Engineering, Thammasat University, 
THAILAND for the financial support. I gratefully acknowledge the computing as- 
sistance of Pawabutra A and Kansompod S in the early phase of this research. 



References 

Bohachevsky IO, Johnson ME, Stein ML (1986) Generalized simulated annealing for function optimization. Technometrics: 209-217

Goldberg DE (1989) Genetic algorithms in search optimization and machine learning. Addison-Wesley

Holland JH (1975) Adaptation in natural and artificial systems. Ann Arbor, The University 
of Michigan Press 

Jennison C, Franconi L, Sheehan N (1995) Stochastic optimization: simulated annealing 
and the genetic algorithm. Institute of Mathematics and its Applications Conference 
Series 54: 209-213 

Kirkpatrick S, Gelatt CD, Vecchi MP (1983) Optimization by simulated annealing. Science 
220: 671-680 

Luangpaiboon P (1995) Dynamic process layout planning. Published Master of Engineering Thesis, Kasetsart University, Thailand

Luangpaiboon P (2000) A comparison of algorithms for automatic process optimization. 
Published Doctor of Philosophy Dissertation, University of Newcastle upon Tyne, UK 

Luangpaiboon P (2001) Proposed finishing strategies based on experimental designs for 
process optimization. Thammasat International Journal of Science and Technology: 
39-45 

Luangpaiboon P, Metcalfe AV, Rowlands RJ, Tham MT, Willis MJ (2000) Comparison of 
a modified simplex and a genetic algorithm for optimizing a chemical process. Pro- 
ceedings of the U* International Conference on Industrial Statistics in Action 2000, 
Newcastle upon Tyne, UK 

Myers RH, Montgomery DC (1995) Response surface methodology: process and product 
optimization using designed experiments. John Wiley & Sons, Inc. 





Optimization on Directionally Convex Sets 



Vladimir Naidenko 

Institute of Mathematics, National Academy of Sciences of Belarus 
11 Surganov str., Minsk, 220072, Belarus 
E-mail: naidenko@im.bas-net.by



Abstract. Directional convexity generalizes the concept of classical convexity. We 
investigate OC-convexity generated by the intersections of C-semispaces that effi- 
ciently approximates directional convexity. We consider the following optimization 
problem in case of the direction set of OC-convexity being infinite. Given a com- 
pact OC-convex set A, maximize a linear form L subject to A. We prove that there 
exists an OC-extreme solution of the problem. A Krein-Milman type theorem has 
been proved for OC-convexity. We show that the OC-convex hull of a finite point 
set represents the union of a finite set of polytopes in case of the direction set being 
finite. 



1 INTRODUCTION 

In the paper, we consider directional convexity that generalizes the concept 
of classical convexity. It arises in many geometrical problems such as image 
processing, databases, VLSI design, etc. [1-4]. 

Let $O$ be a set of vectors in $\mathbb{R}^n$. We assume that the set $O$ is symmetric, i.e., $O = -O$, which is not a restriction in the context of directional convexity. For $a, b \in \mathbb{R}^n$, we write $[a, b]$ for the segment with endpoints $a$, $b$, i.e., $[a, b] = \{\alpha a + (1 - \alpha) b \mid \alpha \in [0, 1]\}$.

Definition 1. A set $A \subseteq \mathbb{R}^n$ is called O-convex if for any two points $x_1, x_2 \in A$ such that the segment $[x_1, x_2]$ is parallel to some nonzero vector of $O$, we have $[x_1, x_2] \subseteq A$ (O-convexity is also called directional convexity or restricted-orientation convexity in the literature).
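Definition 1 can be illustrated by a discrete analogue from image processing. The sketch below is not the continuous definition itself: it assumes a set of integer lattice cells and the axis directions as the direction set, and treats the set as directionally convex when the cells in every row and every column form one contiguous run. The function name `is_ortho_convex` is hypothetical.

```python
def is_ortho_convex(cells):
    """Check a set of (row, col) lattice cells for axis-parallel convexity.

    Discrete analogue of O-convexity with O = {+-e1, +-e2}: along every row
    and every column, the occupied cells must form one contiguous run.
    """
    cells = set(cells)
    for axis in (0, 1):
        lines = {}
        for cell in cells:
            # Group by the other coordinate, collect positions along `axis`.
            lines.setdefault(cell[1 - axis], []).append(cell[axis])
        for coords in lines.values():
            if max(coords) - min(coords) + 1 != len(coords):
                return False  # a gap on this line breaks convexity
    return True


l_shape = {(0, 0), (1, 0), (1, 1)}
u_shape = {(0, 0), (1, 0), (1, 1), (1, 2), (0, 2)}
print(is_ortho_convex(l_shape), is_ortho_convex(u_shape))
```

The L-shaped set passes although it is not classically convex, which reflects how directional convexity relaxes classical convexity; the U-shaped set fails because its top row has a gap.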

The main object of our investigation is a suitable notion of an “O-convex hull” of a set. One can define the O-convex hull of a set $A$ as the intersection of all O-convex sets (according to Definition 1) containing $A$; this O-convex hull will be denoted by $\mathrm{conv}^{O}[A]$.

We shall concentrate on another kind of O-convex hull, namely one defined by means of C-semispaces. By a convex cone we mean a set $C$ such that for any $x, y \in C$ it contains $\lambda x + \mu y$, $\lambda \ge 0$, $\mu \ge 0$. A convex cone $C$ is called acute if $C$ does not contain any one-dimensional linear subspace. A set $M \subseteq O$ is called a half-set if $M \cup -M = O$ and $M \cap -M = \emptyset$. By $\mathrm{CH}[X]$ we denote the convex conic hull of a set $X \subseteq \mathbb{R}^n$, i.e., the smallest convex cone containing the set $X$.







Definition 2. Let $C$ be an acute cone that is maximal among all acute cones containing $O$. Then the set $S_a^{C} = \mathbb{R}^n \setminus (a + C)$ is called a C-semispace of directional convexity at $a \in \mathbb{R}^n$.

Let us introduce the notion of OC-convexity. The class OC of subsets from $\mathbb{R}^n$ yields the OC-convexity if OC contains only the following sets: 1) $\mathbb{R}^n$, 2) all C-semispaces, 3) the intersections of arbitrary C-semispaces.

Definition 3. Let $A \subseteq \mathbb{R}^n$. The set $\mathrm{conv}^{OC}[A]$, called the OC-convex hull of $A$, is defined as the intersection of all C-semispaces containing $A$.

It is easy to check that the O-convex hull is always contained in the OC-convex hull; also, if $O$ contains all vectors (i.e., for the usual convexity), both these hulls of $A$ coincide. Hence the OC-convexity is an approximation of the O-convexity. Many problems that are difficult in the case of O-convexity become easy in the case of OC-convexity. A point $a \in A$ is called OC-extreme for $A$ if $A \setminus \{a\}$ is OC-convex. The set of all OC-extreme points for $A$ is denoted by $\mathrm{ext}^{OC}[A]$. Then the OC-convex hull $\mathrm{conv}^{OC}[A]$ can be determined through the OC-extreme points as follows. For any $x \in \mathbb{R}^n \setminus A$: $x \in \mathrm{conv}^{OC}[A]$ iff $x \notin \mathrm{ext}^{OC}[A \cup \{x\}]$. By $(x, y)$ we denote the scalar product of vectors $x, y \in \mathbb{R}^n$.

2 MAIN RESULTS 

We shall use the following properties of OC-convexity [2-4]. 

• All classically convex sets are OC-convex. 

• Every OC-semispace at an arbitrary point $x \in \mathbb{R}^n$ represents a C-semispace of O-convexity at the same point $x$.

• Let $M \subseteq O$ be a half-set. If $\mathrm{CH}[M]$ is an acute convex cone then $\mathbb{R}^n \setminus (\mathrm{CH}[M] + x)$ is an OC-semispace at an arbitrary point $x$.

• For every OC-semispace $S_x^{OC} = \mathbb{R}^n \setminus (C + x)$, there exists a half-set $M \subseteq C$ such that $\mathbb{R}^n \setminus (\mathrm{CH}[M] + x) = S_x^{OC}$.

The following theorem holds. 

Theorem 1. Let $\mathbb{R}^n \setminus (C + a)$ be an OC-semispace at an arbitrary point $a$. Then, for any OC-convex compact $A$, there exists a point $x \in A$ such that the OC-semispace $\mathbb{R}^n \setminus (C + x)$ includes the whole set $A$ except the point $x$.

Proof. The theorem is proved by induction on the dimension of the linear space $\mathbb{R}^n$. The theorem is obvious for $\mathbb{R}^1$. Suppose that it holds for $\mathbb{R}^{n-1}$. Let us prove the theorem for $\mathbb{R}^n$. Let $H = \{x \in \mathbb{R}^n \mid a \cdot x = 0\}$ be a hyperplane of support to $C$, and let the halfspace $H^+ = \{x \in \mathbb{R}^n \mid a \cdot x \ge 0\}$ include the whole set $C$. Let us consider the following optimization problem in the variable $x \in \mathbb{R}^n$:

$$(a, x) \to \max, \quad x \in A. \tag{1}$$

Since the set $A$ is compact, the function $(a, x)$ of problem (1) attains its maximum value at some point $x_0 \in A$ by the Weierstrass theorem. We denote the hyperplane $\{x \in \mathbb{R}^n \mid a \cdot x = a \cdot x_0\}$ by $H_0$. If $H \cap O = \emptyset$ then any point $x \in H_0 \cap A$ obviously proves the theorem in this case. Suppose that $H \cap O \ne \emptyset$. The hyperplane $H_0$ can be described as an affine $(n-1)$-dimensional subspace. Let us associate a linear $(n-1)$-dimensional space with this affine subspace as follows. Fix a point $h_0$ in $H_0$ as the origin of coordinates. Then every point $x \in H_0$ corresponds to the vector $v = \overrightarrow{h_0 x}$. Let $O'$ be a set of vectors in $H_0$ such that each vector of $O'$ is collinear to some vector of $H \cap O$ and vice versa. In the space $H_0$, the point set $H_0 \setminus (C + h_0)$ represents an OC-semispace at the point $h_0$ for the direction set $O'$ of directional convexity. Since the space $H_0$ can be identified with $\mathbb{R}^{n-1}$, there exists a point $a_0 \in H_0 \cap A$ such that $(H_0 \setminus (C + a_0)) \cap (H_0 \cap A) = (H_0 \cap A) \setminus \{a_0\}$ by the induction hypothesis. Hence $(C + a_0) \cap H_0 \cap A = \{a_0\}$. Note that $(C + a_0) \cap A \subseteq H_0$. Therefore $(C + a_0) \cap A = \{a_0\}$, i.e. $(\mathbb{R}^n \setminus (C + a_0)) \cap A = A \setminus \{a_0\}$. Thus, the OC-semispace $\mathbb{R}^n \setminus (C + a_0)$ in the space $\mathbb{R}^n$ proves the theorem in this case. This concludes the proof.

Let $A$ be a compact OC-convex set in $\mathbb{R}^n$, and let $O$ be infinite. Let us consider the following optimization problem

$$(a, x) \to \max, \quad x \in A, \tag{2}$$

where $a$ is fixed and $x$ is a variable vector.

As shown in [5], in the case of $O$ being finite, there exists at least one solution of (2) which is an OC-extreme point of $A$. We shall show that an analogous statement holds in the infinite case as well.

The following theorem holds. 

Theorem 2. There exists an OC-extreme point of $A$ which is a solution to problem (2).

Proof. Let us consider the hyperplane $H = \{x \in \mathbb{R}^n \mid a \cdot x = 0\}$ and the halfspace $H^+ = \{x \in \mathbb{R}^n \mid a \cdot x \ge 0\}$ bounded by $H$. Note that there exists a classically convex semispace $S_0$ at the point $0$ such that $H$ is a hyperplane of support to $S_0$ and $\mathbb{R}^n \setminus S_0 \subseteq H^+$, according to the properties of semispaces of classical convexity. Either $v \in \mathbb{R}^n \setminus S_0$ or $-v \in \mathbb{R}^n \setminus S_0$ for any nonzero vector $v \in \mathbb{R}^n$ (but not simultaneously). Then $\mathbb{R}^n \setminus S_0$ includes some half-set $M$ of vectors of $O$. Since the cone $\mathbb{R}^n \setminus S_0$ is acute, the cone $\mathrm{CH}[M]$ is acute as well. Hence $\mathbb{R}^n \setminus \mathrm{CH}[M]$ represents an OC-semispace at the point $0$. Note that $H$ is a hyperplane of support to $\mathrm{CH}[M]$, and the halfspace $H^+$ includes $\mathrm{CH}[M]$. Thus, using the proof of Theorem 1, one can find an OC-extreme point for $A$ which is a solution to problem (2). This concludes the proof.








For the OC-convex sets we have proved a Krein-Milman type theorem.

Theorem 3. Let $O$ be finite. If $A \subseteq \mathbb{R}^n$ is a compact set then $\mathrm{conv}^{OC}[A] = \mathrm{conv}^{OC}[\mathrm{ext}^{OC}[A]]$.

Proof. Suppose there exists a compact set $A \subseteq \mathbb{R}^n$ such that $\mathrm{conv}^{OC}[A] \ne \mathrm{conv}^{OC}[\mathrm{ext}^{OC}[A]]$. Then there exists a point $x \in A$ such that the OC-convex hull $\mathrm{conv}^{OC}[\mathrm{ext}^{OC}[A]]$ does not contain $x$. Hence $x \in \mathrm{ext}^{OC}[\mathrm{ext}^{OC}[A] \cup \{x\}]$. Moreover, one can find an OC-semispace $S_x^{OC} = \mathbb{R}^n \setminus (C + x)$ such that $\mathrm{ext}^{OC}[A] \subseteq S_x^{OC}$, i.e. $(C + x) \cap \mathrm{ext}^{OC}[A] = \emptyset$. Since $O$ is finite, $(C + x)$ is a closed polyhedral set with the corner point $x$. Hence, for some constants $a \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$, there exists a closed halfspace $H = \{v \in \mathbb{R}^n \mid (a, v) \ge \alpha\}$ such that $(C + x) \subseteq H$ and $\{v \in \mathbb{R}^n \mid (a, v) \le \alpha\} \cap (C + x) = \{x\}$. In other words, there exists a hyperplane $\{v \in \mathbb{R}^n \mid (a, v) = \alpha\}$ which intersects $(C + x)$ only at the corner point $x$. Let us consider the following problem:

$$(a, v) \to \max, \quad v \in (C + x) \cap A. \tag{3}$$

Since $(C + x)$ is closed, the set $(C + x) \cap A$ is compact. Hence the function $(a, v)$ attains its maximum value on $(C + x) \cap A$ by the Weierstrass theorem. Therefore, problem (3) has at least one solution. Fix a point $v^*$ such that $(a, v^*)$ is the maximum of $(a, v)$. Consider the set $(C + v^*)$. Since $H$ contains the whole set $C + x$, the halfspace $H^* = \{v \in \mathbb{R}^n \mid (a, v) \ge (a, v^*)\}$ contains the whole set $(C + v^*)$. On the other hand, $\{v \in \mathbb{R}^n \mid (a, v) \le (a, v^*)\} \cap (C + v^*) = \{v^*\}$. Since $v \in (C + x) \cap A$ implies $(a, v) \le (a, v^*)$, we have $((C + x) \cap A) \cap (C + v^*) = \{v^*\}$. The statement $A \cap (C + v^*) = \{v^*\}$ follows from $(C + v^*) \subseteq (C + x)$, i.e. $v^* \in \mathrm{ext}^{OC}[A]$. This contradicts the assumption $(C + x) \cap \mathrm{ext}^{OC}[A] = \emptyset$. This concludes the proof.

In what follows, we shall show that the OC-convex hull $\mathrm{conv}^{OC}[A]$ represents the union of a finite set of polytopes in the case of both the sets $A$ and $O$ being finite. Let us derive a formula describing $\mathrm{conv}^{OC}[A]$ in that case. Let $A = \{a_1, \dots, a_k\}$. We set an enumeration of OC-semispaces $\mathbb{R}^n \setminus (C_1 + a), \dots, \mathbb{R}^n \setminus (C_m + a)$ at an arbitrary point $a \in \mathbb{R}^n \setminus A$. Note that the point $a$ belongs to $\mathrm{conv}^{OC}[A]$ iff $a \notin \mathrm{ext}^{OC}[\{a_1, \dots, a_k\} \cup \{a\}]$. This means that

$$(C_i + a) \cap A \ne \emptyset \quad (\forall i \in \overline{1, m}). \tag{4}$$

Any polyhedral set $(C_i + a)$ can be described in terms of all solutions to the system of linear inequalities $\{x \in \mathbb{R}^n \mid A^{(i)} x \ge A^{(i)} a\}$, where $A^{(i)}$ is a matrix corresponding to $C_i$. Then, using relation (4), one can derive the following formula:

$$\mathrm{conv}^{OC}[A] = \bigcap_{i=1}^{m} \Bigl( \bigcup_{j=1}^{k} \{a \in \mathbb{R}^n \mid A^{(i)} a_j \ge A^{(i)} a\} \Bigr). \tag{5}$$








Since any classically convex set is OC-convex, the classically convex hull $\mathrm{conv}[\{a_1, \dots, a_k\}]$ includes $\mathrm{conv}^{OC}[\{a_1, \dots, a_k\}]$. Moreover, since the classically convex hull of any finite point set is bounded, $\mathrm{conv}^{OC}[\{a_1, \dots, a_k\}]$ is bounded as well. Using formula (5), we obtain that $\mathrm{conv}^{OC}[\{a_1, \dots, a_k\}]$ represents the union of a finite set of polytopes.

We shall consider the following problem of finding $\mathrm{ext}^{OC}[A]$. Given a finite $A$, $n$ and $O$, does an arbitrary point $a$ of $A$ belong to $\mathrm{ext}^{OC}[A]$? The following theorem holds.

Theorem 4. The decision problem whether or not an arbitrary point belongs 
to the OC -extreme point set, is NP-complete. 

Proof. We shall show how to reduce SAT to the problem. Let $F = \bigwedge_{i=1}^{p} C_i$ be a CNF where every $C_i$ is a disjunction of literals from $\{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$. Suppose that $O = \{e_1, -e_1, \ldots, e_n, -e_n\}$ where the vectors $e_1, \ldots, e_n$ yield a basis of $\mathbb{R}^n$. Let $A$ be $\{a_i \mid i \in \overline{0,p}\}$ where $a_0$ is the origin of coordinates and $a_i = \alpha_{i1} e_1 + \ldots + \alpha_{in} e_n$, $\forall i \in \overline{1,p}$. The values $\alpha_{i1}, \ldots, \alpha_{in}$ are determined as follows: if $x_j$ lies in $C_i$ then $\alpha_{ij} = -1$; if $\bar{x}_j$ lies in $C_i$ then $\alpha_{ij} = 1$; if neither of the literals $x_j$ and $\bar{x}_j$ lies in $C_i$ then $\alpha_{ij} = 0$. We shall show that the point $a_0$ belongs to $\mathrm{ext}^{OC}[A]$ iff the CNF is satisfiable. Every valuation of $x_1, \ldots, x_n$ bijectively corresponds to some set $M(x_1, \ldots, x_n) = \{e_1(x_1), \ldots, e_n(x_n)\}$ where $e_j(x_j) = e_j$ if $x_j = 1$, and $e_j(x_j) = -e_j$ if $x_j = -1$, for any $j \in \overline{1,n}$. Note that any OC-semispace at $a_0$ for the set $O$ can be determined as an appropriate set $\mathbb{R}^n \setminus CH[M(x_1, \ldots, x_n)]$ with the values of $x_1, \ldots, x_n$ fixed in some way. Suppose we can find values of $x_1, \ldots, x_n$ such that $a_i \notin CH[M(x_1, \ldots, x_n)]$, where $i \geq 1$. Then there exists $\alpha_{ij} \neq 0$ such that $\alpha_{ij} = -1$ if $e_j(x_j) = e_j$, and $\alpha_{ij} = 1$ if $e_j(x_j) = -e_j$. Hence $C_i$ is satisfied with the values of $x_1, \ldots, x_n$. In other words, if there exists a valuation of $x_1, \ldots, x_n$ such that $A \setminus \{a_0\} \subset \mathbb{R}^n \setminus CH[M(x_1, \ldots, x_n)]$, then $F = \bigwedge_{i=1}^{p} C_i$ is satisfied with the values of $x_1, \ldots, x_n$. The relation $A \setminus \{a_0\} \subset \mathbb{R}^n \setminus CH[M(x_1, \ldots, x_n)]$ is equivalent to $a_0 \in \mathrm{ext}^{OC}[A]$. This concludes the polynomial-time reduction of SAT to the problem of deciding $a_0 \in \mathrm{ext}^{OC}[A]$. Now we shall show that the problem lies in NP. Note that the relation $a_0 \in \mathrm{ext}^{OC}[A]$ is equivalent to $(\exists M \subset O)(\forall y \in A \setminus \{a_0\})[y \in \mathbb{R}^n \setminus CH[M],\ M \cup -M = O,\ M \cap -M = \emptyset]$, which can be checked in polynomial time for any $M \subset O$. This concludes the proof.
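The construction in the proof can be illustrated with a small sketch. It is not taken from the paper: it assumes that $CH[M]$ denotes the cone spanned by $M$ (so that $a_i$ lies outside it exactly when some non-zero coordinate of $a_i$ has the sign opposite to the chosen direction), that clauses are encoded DIMACS-style as lists of non-zero integers, and that no clause contains both a variable and its negation. The function names are hypothetical. The inner test for one fixed $M$ is the polynomial-time check; the outer loop over all sign choices is brute force and only illustrates the equivalence on tiny inputs.

```python
from itertools import product


def clause_vectors(cnf, n):
    """Build the coefficient rows alpha_i of the points a_i, one per clause.

    cnf is a list of clauses; literal j means x_j, literal -j means not x_j
    (an assumed DIMACS-style encoding, variables numbered 1..n).
    """
    vectors = []
    for clause in cnf:
        alpha = [0] * n
        for literal in clause:
            # x_j in the clause gives -1, its negation gives +1 (as in the proof)
            alpha[abs(literal) - 1] = -1 if literal > 0 else 1
        vectors.append(alpha)
    return vectors


def outside_cone(alpha, signs):
    """True if a_i has a non-zero coordinate opposite to the chosen direction.

    signs[j] is +1 for e_j (x_j = 1) and -1 for -e_j (x_j = -1).
    Assumes CH[M] is the cone spanned by the chosen directions.
    """
    return any(a != 0 and a == -s for a, s in zip(alpha, signs))


def origin_is_extreme(cnf, n):
    """Brute-force search for a set M that separates a_0 from all other points."""
    vectors = clause_vectors(cnf, n)
    return any(
        all(outside_cone(alpha, signs) for alpha in vectors)
        for signs in product((1, -1), repeat=n)
    )
```

For instance, under these assumptions the formula $(x_1 \vee x_2) \wedge \bar{x}_1$ would be passed as `[[1, 2], [-1]]` with `n = 2`.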

Corollary 1. It is NP-hard to verify the membership of an arbitrary point
in the OC-convex hull of a finite point set.

References 

1. Wood, D. (1985) Computational Geometry / ed. G.T. Toussaint. North-Holland, Amsterdam

2. Metelskii, N.N., Martynchik, V.N. (1996) Directional convexity. Matematicheskie zametki. 60, 406-413 (in Russian)

3. Metelskii, N.N., Naidenko, V.G. (1999) Directionally convex hulls in . Vesti Natsionalnoi Akademii nauk Belarusi. Seriya fiz.-mat. nauk. 4, 39-42 (in Russian)

4. Metelskii, N.N., Naidenko, V.G. (2000) On a class of directionally convex semispaces. Vesti Natsionalnoi akademii nauk Belarusi. Seriya fiz.-mat. nauk. 1, 56-59 (in Russian)

5. Naidenko, V.G. (2000) Directionally convex set and its application to an optimization problem. Proc. Int. Workshop Discrete Optimization Methods in Scheduling and Computer-Aided Design, Minsk, Republic of Belarus, September 5-6. Institute of Engineering Cybernetics, Minsk, 159-161.





Meta-Heuristics in Virtual Learning Environments



Torsten Reiners, Imke Sassen and Stefan Voß

Technische Universität Braunschweig, Institut für Wirtschaftswissenschaften, Abteilung ABWL, Wirtschaftsinformatik und Informationsmanagement, Abt-Jerusalem-Str. 7, D-38106 Braunschweig, {t.reiners, i.sassen, stefan.voss}@tu-bs.de.



Abstract. Virtual learning environments are gaining importance in teaching. For Operations Research, the demand for practice-oriented education creates an increased need for high-quality learning material. Using concrete learning content, we show how the theoretical foundations can be linked with practical applications. To this end, learners first work out basic concepts and relationships, which in a further step are examined from a new, reality-oriented perspective by means of real problems and are thereby deepened and consolidated.



Introduction

In view of a rising number of students, reduced financial resources of educational institutions and, beyond that, the demand for lifelong-learning offerings, virtual universities provide further opportunities to promote the quality of teaching, and for this reason numerous countries are integrating them into their educational concepts (see Hazemi et al. (1998)). In this contribution, we take a closer look at how Operations Research can be presented in virtual learning environments (VLE), using meta-heuristics as an example. Since Operations Research in particular requires practice-oriented teaching for acquiring knowledge and skills in the modelling and solution of (mathematical) problems, algorithms, software development and project management, research developments concerning new didactic methods and virtual learning environments should be followed and put into practice. Meta-heuristics in particular, as contemporary and important optimization tools, lead a shadowy existence in teaching, both in textbooks and in practical implementations, and are poorly represented, so that traditional classroom courses in OR/MS deal with meta-heuristics only insufficiently. This, together with a lack of good examples from practice, means that the essential advantages of meta-heuristics over other optimization methods are conveyed inadequately; well-known algorithms such as genetic algorithms are an exception.







Meta-heuristics have analogies to other areas of Operations Research in many respects. For example, the simplex algorithm (see, among others, Domschke and Drexl (2002)) solves linear optimization problems by purposefully traversing a polyhedron that describes the solution space. Methods for solving transportation problems proceed analogously: after an opening heuristic has determined a feasible solution, improvement methods such as the MODI method (see e.g. Domschke (1995)) are applied on strongly structured solution spaces. Because the principles for solving optimization problems are analogous, treating meta-heuristics offers students far-reaching opportunities for transfer and application from the point of view of subject didactics. It therefore makes sense to cover local search and meta-heuristics in teaching. In this contribution we give an introductory overview of our concept; for a more detailed presentation see Reiners and Voß (2002).

Besides the simple provision of teaching material without any further media-didactic preparation, there already exist learning environments on the Internet that use forms of presentation more advanced than hypertext. An example of offline software is ORWelt (2002); freely usable learning environments are tutOR by Sniedovich and Byrne (2002) and, based on it, tutORial by IFORS (2002). Commercial systems, which essentially support teaching and the administration of course material, include WebCT (2002). In some countries, such as the USA or Australia, it is also common to use proprietary software such as Microsoft Excel in class instead of web-based learning environments; see e.g. Bell (2000). What these approaches have in common is that, for meta-heuristics, the offering is currently limited at best to static, interlinked texts or to unstructured collections of interactive applets without further theoretical explanation. Further forms of presenting optimization results can be found, e.g., in Jones (1995).

Moreover, current approaches lack a sufficient integration of practical problems together with interactive forms of presentation; see Wolsey (1979) or Reisman (1997): “This profession currently has more algorithms than applications”.



Structure of the Virtual Learning Environment

For the internal representation of a course on meta-heuristics, an advanced concept for a VLE is to be used. In particular, the subdivision into semantic modules of different sizes is intended to allow a dynamic presentation that students can configure, so as to meet the needs of the learners and thereby increase motivation overall. Further components are meant to address the difficulty of sustaining learners' interest and motivation in Operations Research and of increasing participation in web-based courses; no distinction is made between types of learners (e.g. “distance learners” or “lifelong learners”). To this end, a relational database is integrated that links the learning objects by exploiting the meta-information stored there. This allows adaptation to students and their needs with respect to learning habits, as well as the inclusion of different forms of communication. A detailed description of the designed architecture is given in Reiners et al. (2002a), and the structure of the teaching material in Reiners et al. (2002b).

The course presented here aims to enable learners without prior knowledge to later apply meta-heuristics independently to complex practical problems. To this end, basic principles are conveyed in a preceding course unit by means of interactive examples. A meta-heuristic is defined as an iterative generation process that guides a subordinate heuristic and constitutes a method for solving an optimization problem in adequate time. Every optimization problem has a specific solution space that contains all solutions that are feasible with respect to the constraints (and possibly also infeasible solutions, such as those arising from suitable problem relaxations). In most cases, however, the solution space is too large for the optimal solution to be found by complete enumeration, or for such a procedure to be carried out within the given time. Meta-heuristics therefore restrict the solution space to certain regions through the intelligent combination of different methods, and they use learning strategies to structure information and to find optimal or good solutions effectively; see Voß et al. (1999).

The basic components of meta-heuristics are used to explain specific algorithms such as Steepest Descent, Simulated Annealing or Tabu Search. The learner is supported by simple interactive examples that illustrate the underlying principles and clarify the behaviour of the different methods. Figure 1 shows the steps of an interactive animation that illustrates a neighbourhood. The interaction for the students consists, for example, in having to choose the next “best” neighbour, with the choice being assessed by the VLE. By taking analogies into account, knowledge about, e.g., the simplex algorithm can be brought in here (and, conversely, the procedure of the simplex algorithm can be explained as a “simple” local search on a well-structured solution space). The procedure of determining an initial feasible solution and subsequently applying an improvement method, as is usual in meta-heuristics, likewise has a counterpart in the so-called two-phase or M method.
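The idea of repeatedly choosing the best neighbour can be sketched in a few lines. The following is only an illustration under assumptions that are not taken from the paper: solutions are permutations stored as lists, the neighbourhood consists of all pairwise swaps, and `cost` is any function to be minimised; the names `swap_neighbours` and `steepest_descent` are hypothetical.

```python
from itertools import combinations


def swap_neighbours(solution):
    """Yield every solution reachable by swapping two positions (assumed neighbourhood)."""
    for i, j in combinations(range(len(solution)), 2):
        neighbour = list(solution)
        neighbour[i], neighbour[j] = neighbour[j], neighbour[i]
        yield neighbour


def steepest_descent(start, cost):
    """Move to the best neighbour until no neighbour is strictly better."""
    current = list(start)
    while True:
        best = min(swap_neighbours(current), key=cost, default=None)
        if best is None or cost(best) >= cost(current):
            return current  # local optimum with respect to the swap neighbourhood
        current = best


def misplaced(solution):
    """Toy cost: number of positions i whose entry differs from i."""
    return sum(1 for i, value in enumerate(solution) if value != i)
```

With the toy cost above, `steepest_descent([2, 0, 1], misplaced)` walks from the starting solution to a permutation that no single swap can improve.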



Fig. 1 Representation of a neighbourhood for a given solution (figure label: Starting Solution)








Building on the basic principles, students should apply the acquired knowledge to a more complex and largely realistic problem in order to deepen their understanding of the subject. Problems from research projects or from practice can be used for this purpose. Supported by the virtual learning environment, the students develop a solution approach using existing, configurable software packages. This software can be either a commercial product with a graphical user interface for entering the problem and the parameters of the solution algorithms, or a software library that contains partly reusable code but still requires some programming skills. To ensure a far-reaching integration into our virtual learning environment without breaks in media or technology, we use HotFrame (Heuristic OpTimization FRAMEwork) by Fink and Voß (2002). A user interface allows the essential components of the meta-heuristics to be configured. In addition, automatically generated source code is produced that contains to-do parts for the learners, which vary in difficulty and complexity. Finally, the VLE provides an experimentation area in which the result achieved can be tested on various problems with different parameter settings of the meta-heuristics. The results are collected and can be evaluated independently by the learners using statistical methods and visualizations (see also Figure 2).



Outlook

Even highly interactive learning environments mostly do not support the idea that students can apply acquired knowledge within a particular field of research in order to investigate questions from practice or current research. Yet it is precisely the experience of the practical relevance and applicability of what has been learned that can contribute greatly to increasing learners' motivation. Furthermore, because students can evaluate the result they have achieved on their own, they may develop the ambition to improve it even further, which again leads to further engagement with the course content. From the perspective of the psychology of learning, generating and testing one's own solutions thus contains elements conducive to learning, since the alternation between generating and testing hypotheses supports discovery-based and exploratory learning, and solutions can be constructed as often as desired. Compared with other methods, self-directed learning is particularly encouraged, since the possibility of self-evaluation also motivates learners to find and fix errors in difficult parts of a program on their own, while the learning environment can optionally provide help. Especially at the start of transferring acquired knowledge to solving complex problems, learners should not feel left alone but should be able to call on support to the extent they wish. Above all, the experience of having correctly used the provided software to solve complex problems gives students, on the one hand, information about their learning success and, on the other hand, assures them of the value of what they have learned and of its interdisciplinary applicability, contrary to the prejudice that the methods of Operations Research achieve good results only in theoretical applications.

Fig. 2 Application sequence of a meta-heuristic (figure labels: Specification, Framework, Compiled program, Problem, parameter, data, Execution environment (Client/server architecture))



Acknowledgement

This work was supported by the BMBF (Bundesministerium für Bildung und Forschung) within the programme „Neue Medien in der Bildung“ (08NM094D).













References

Bell, P.C., 2000. Teaching Business Statistics with Microsoft Excel. INFORMS Transactions on Education. 1(1). http://ite.informs.org/Vol1No1/bell/bell.html.

Bjorck, U., July 2002. List of Virtual Universities. http://www.ped.gu.se/ulric/vais.html.

Domschke, W., 1995. Logistik: Transport. 4th edition. Oldenbourg, München.

Domschke, W. and A. Drexl, 2002. Einführung in Operations Research. 5th edition. Springer, Berlin.

Fink, A. and S. Voß, 2002. HotFrame: A Heuristic Optimization Framework. In: S. Voß and D.L. Woodruff (eds.). Optimization Software Class Libraries. Kluwer, Boston. 81-154.

Hazemi, R., S. Hailes and S. Wilbur (eds.), 1998. The Digital University: Reinventing the Academy. Springer, London.

IFORS, July 2002. tutORial. http://www.ifors.ms.unimelb.edu.au/tutorial.

Jones, C.V., 1995. Visualization and Optimization. Kluwer, Boston.

ORWelt, July 2002. ORWelt. http://dsor.uni-paderborn.de/de/forschung/wbs/orwelt.

Reisman, A., 1997. Flowshop Scheduling/Sequencing Research: A Statistical Review of the Literature, 1952-1994. IEEE Transactions on Engineering Management. 44, 316-329.

Reiners, T. and S. Voß, 2002. Teaching Meta-Heuristics. Working Paper, Technische Universität Braunschweig.

Reiners, T., D. Reiß and S. Voß, 2002a. Using Hyperbolic Trees and SmartBars within Virtual Learning Environment Concepts. Proceedings of the World Congress Networked Learning in a Global Environment, Challenges and Solutions for Virtual Education (NL 2002). ICSC-Naiso Academic Press, Millet Alberta [ISBN: 3-906454-31-2], #100029-03-TR-026, 1-7.

Reiners, T., D. Reiß and S. Voß, 2002b. XML-basierte Kodierung von Lernobjekten: Dokumentation zu VORMS. Working paper. Technische Universität Braunschweig, in preparation.

Sniedovich, M. and A. Byrne, July 2002. tutOR. http://www.tutor.ms.unimelb.edu.au/frame.html.

VORMS, July 2002. Virtuelles Studienfach Operations Research/Management Science. http://www.vorms.org.

Voß, S., 2001. Meta-Heuristics: The State of the Art. In: A. Nareyek (ed.). Local Search for Planning and Scheduling. Lecture Notes in Artificial Intelligence 2148. Springer, Berlin, 1-23.

Voß, S., S. Martello, I.H. Osman and C. Roucairol (eds.), 1999. Meta-Heuristics: Advances and Trends in Local Search Paradigms for Optimization. Kluwer, Boston.

WebCT, Inc., July 2002. WebCT.com. http://www.webct.com.

Wolsey, R.E.D., 1979. Pragmatism Triumphant or Past Sophistication and Future Elegance. In: K.B. Haley (ed.). Operational Research 78. North Holland, Amsterdam. 80-86.





An Evolutionary Algorithm for Bayesian Network 
Triangulation 



Tomasz Lukaszewski 

Institute of Computing Science, Poznan University of Technology, ul. 
Piotrowo 3a, 60-965 Poznan, Poland, luki@man.poznan.pl 



Abstract: The problem of triangulation (decomposition) of Bayesian networks 
is considered. Triangularity of a Bayesian network is required in a general evi- 
dence propagation scheme on this network. Finding an optimal triangulation is 
NP-hard. A local search heuristic based on the idea of evolutionary algorithms is 
presented. The results obtained using existing and proposed approaches are compared
on the basis of a computational experiment.

Keywords: Bayesian networks, graph triangulation, evolutionary algorithm 



1 Introduction 

Some of the earliest Artificial Intelligence approaches to reasoning under uncer- 
tainty were based on Bayesian and decision-theoretic schemes. Both the computa- 
tional and the representational complexity of probabilistic schemes caused a long- 
lasting departure from these approaches. Only recently, the development of prob- 
abilistic graphical models, such as Bayesian networks, caused a renewed interest 
in applying probability theory in intelligent systems. Today, Bayesian networks 
(directed acyclic graphs) are successfully applied in a variety of problems, includ- 
ing machine diagnosis, user interfaces, natural language interpretation, planning, 
vision, robotics, data mining and many others. They provide a natural, efficient 
method for representing probabilistic dependencies among variables. It is enough 
to consider only the known dependencies among variables in a domain, rather than 
to assume that all variables are dependent on all the other variables. 

The most common task performed on Bayesian networks is general evidence propagation: the calculation of the posterior marginal distributions of all non-evidence variables for a given set of evidence. This kind of reasoning is performed mainly by message-passing algorithms in a join tree. Messages are passed from the leaves towards the root of the join tree (inward phase), and then backwards (outward phase). The best-known algorithms are Lauritzen-Spiegelhalter, Hugin and Shafer-Shenoy [5,1,7]. They differ mainly in the construction of messages. However, all of them are based on a join tree created for a given Bayesian network. Such a join tree can be created by triangulation of the (moralised) graph of the Bayesian network. For a given Bayesian network many different triangulated graphs can be built. Moreover, for each triangulated graph many different join trees can be built. Finding an optimal join tree for a given Bayesian network is NP-hard; however, the second stage (building a join tree from a triangulated graph) can be done very efficiently [2]. Different heuristic approaches are used at the first stage, the Bayesian network triangulation [3].

Bayesian network triangulation is also used in our approach to reasoning with knowledge updating in Bayesian networks [6]. Therefore, we developed a local search heuristic based on the idea of evolutionary algorithms to solve this problem. We started by examining our algorithm on the general evidence propagation problem. We compared the efficiency of some heuristic approaches known from the literature with the proposed evolutionary algorithm for different numbers of graph vertices and different graph densities. The results of the computational experiment are presented in this paper.



2 Problem formulation 

Triangularity (chordality) of a Bayesian network is required in the general evidence propagation scheme on this network. Depending on the problem to be solved by graph triangulation, optimality may be defined differently. The typical optimality criteria are the minimum fill criterion, the minimum size criterion and the minimum weight criterion [3]. Since our interest in graph triangulation originates from the general evidence propagation algorithm, we want to obtain a Bayesian network triangulation with small probability tables. Therefore, our objective is to obtain triangulations of minimum weight. Let $|V_i| = n_i$ be the number of states (state size) of a vertex $V_i$. The base 2 logarithm of the number of states of $V_i$ is the weight of $V_i$ (denoted $w(V_i)$). Let $C_j = \{V_1, \ldots, V_k\}$ represent a clique of a triangulated graph $G_T$. The weight of $C_j$ is defined as follows:

$$w(C_j) = \sum_{V_i \in C_j} w(V_i) \qquad (2.1)$$

and the weight of $G_T$ is defined as follows:

$$w(G_T) = \log_2 \sum_{C_j} 2^{w(C_j)} = \log_2 \sum_{C_j} \prod_{V_i \in C_j} n_i \qquad (2.2)$$

Note that if all vertices have equal weights the minimum size criterion and the minimum weight criterion are identical. The size of $C_j$ is $s(C_j) = k$ and the size of $G_T$ is the sum of the sizes of all of its cliques.

The basic technique applied to triangulate a graph G (that represents a given
Bayesian network) is to add the extra edges T produced by eliminating the vertices
of G one by one. A vertex $V_i$ is eliminated in the following way:

1. Adding edges such that the vertices adjacent to $V_i$ are pairwise adjacent.

2. Deleting $V_i$ and its incident edges.

This technique is not guaranteed to produce optimal triangulations (in terms of 
the weight of the triangulated graph) when the vertices are selected at random. 
However, the number of elimination sequences grows exponentially with the 
number of vertices making the enumerative approach inefficient. Thus it is justi- 
fied to apply a more sophisticated elimination technique. 
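The elimination procedure and the weight of the resulting triangulation can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: it assumes the (moralised) graph is given as a dictionary mapping each vertex to the set of its neighbours, that vertices are sortable, and that `states` maps each vertex to its number of states; the function names are hypothetical. The cliques are collected as the sets formed by an eliminated vertex and its neighbours that are not contained in an earlier such set.

```python
from math import log2, prod


def eliminate(adjacency, order):
    """Triangulate an undirected graph by eliminating vertices in the given order.

    Returns the list of added (fill-in) edges and the list of cliques.
    """
    adj = {v: set(neighbours) for v, neighbours in adjacency.items()}
    fill_edges, cliques = [], []
    for v in order:
        neighbours = sorted(adj[v])
        # Step 1: make the neighbours of v pairwise adjacent.
        for index, a in enumerate(neighbours):
            for b in neighbours[index + 1:]:
                if b not in adj[a]:
                    adj[a].add(b)
                    adj[b].add(a)
                    fill_edges.append((a, b))
        candidate = frozenset(neighbours) | {v}
        if not any(candidate <= clique for clique in cliques):
            cliques.append(candidate)
        # Step 2: delete v and its incident edges.
        for u in neighbours:
            adj[u].discard(v)
        del adj[v]
    return fill_edges, cliques


def table_size_sum(cliques, states):
    """Sum over all cliques of the product of the state sizes of their vertices."""
    return sum(prod(states[v] for v in clique) for clique in cliques)


def graph_weight(cliques, states):
    """Weight of the triangulated graph as in (2.2)."""
    return log2(table_size_sum(cliques, states))
```

For a cycle on four binary vertices, any elimination order adds one edge and yields two cliques of three vertices, so the weight is $\log_2(8 + 8) = 4$.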

Several heuristic algorithms were suggested and compared for establishing 
elimination orderings in [3]. The tests were based on two “real world” graphs (43 








and 56 vertices, state sizes of vertices range from 3 to 21) and two “artificial” ones 
(50 vertices, state sizes of vertices range from 2 to 5). The goal function was to 
minimise the weight of the triangulated graph. The best average results were pro- 
duced by the minimum weight heuristic. A simulated annealing metaheuristic was 
also applied. For the real and artificial graphs, simulated annealing performed bet- 
ter than the minimum weight heuristic. 



3 Evolutionary algorithm 

The proposed evolutionary algorithm is based on the principles of evolution (selection and recombination of individuals). We implemented three different versions of the algorithm:

1. Crossover removes parents from the population when children are generated. Mutation is possible for all the individuals of the population.

2. Crossover adds children to the population: parents are not removed and the population size is extended. Mutation is possible for all the individuals of the extended population. The extended population is reduced to the initial size during the selection phase.

3. Crossover adds children to the population: parents are not removed and the population size is extended. Mutation is possible only for the newly generated individuals. The extended population is reduced to the initial size during the selection phase.

We used the tournament selection method (constant tournament size equal to 3). Different crossover operators (PMX, CX, OX1a, OX1b, OX1c, OX2, POS) and mutation operators (ISM, EM, DM, IVM) were implemented. These operators are described in [4]. The crossover and mutation rates were varied from 10% to 100% in steps of 10%. The number of iterations was varied from 1000 to 2000 depending on the number of graph vertices.
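One generation of the second version might look like the sketch below. Several details are assumptions made only for illustration, since the text does not fix them: individuals are elimination orderings stored as lists, a lower fitness value is better, tournaments are used both to pick parents and to reduce the extended population, the mutation moves one randomly chosen vertex to a random position (a simple insertion-type mutation), and a classic order crossover stands in for the crossover operators of [4]. All function names are hypothetical.

```python
import random


def tournament_select(population, fitness, rng, size=3):
    """Return the best of `size` randomly drawn individuals (lower fitness wins)."""
    return min(rng.sample(population, size), key=fitness)


def insertion_mutation(order, rng):
    """Remove one randomly chosen element and reinsert it at a random position."""
    child = list(order)
    gene = child.pop(rng.randrange(len(child)))
    child.insert(rng.randrange(len(child) + 1), gene)
    return child


def order_crossover(parent1, parent2, rng):
    """Keep a random segment of parent1 in place, fill the rest in parent2's order."""
    i, j = sorted(rng.sample(range(len(parent1)), 2))
    segment = parent1[i:j + 1]
    rest = [gene for gene in parent2 if gene not in segment]
    return rest[:i] + segment + rest[i:]


def next_generation(population, fitness, rng, crossover_rate=0.4, mutation_rate=1.0):
    """Version 2: children extend the population, everyone may mutate, then select."""
    size = len(population)
    extended = list(population)
    for _ in range(size // 2):
        if rng.random() < crossover_rate:
            parent1 = tournament_select(population, fitness, rng)
            parent2 = tournament_select(population, fitness, rng)
            extended.append(order_crossover(parent1, parent2, rng))
    extended = [
        insertion_mutation(ind, rng) if rng.random() < mutation_rate else ind
        for ind in extended
    ]
    # Selection phase: reduce the extended population to the initial size.
    return [tournament_select(extended, fitness, rng) for _ in range(size)]
```

In use, `fitness` would map an elimination ordering to the objective value of the triangulation it produces, and `next_generation` would be called once per iteration with a seeded `random.Random` instance.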



4 Description of experiments 

The experiments were carried out for randomly generated graphs with 20 to 100 vertices, two states per vertex and different densities (Table 1). The minimal graph density k for which a directed acyclic graph G with n vertices is connected (the minimal relative number of edges) is the following:

$$k = \frac{n-1}{(n^2-n)/2} \cdot 100\,[\%] = \frac{2}{n} \cdot 100\,[\%] \qquad (4.1)$$
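As a quick check of formula (4.1), the minimal density can be computed for the graph sizes used in the experiment. The function name below is hypothetical and the snippet is only a worked example of the arithmetic.

```python
def minimal_density_percent(n):
    """Minimal density in percent: n - 1 edges out of (n^2 - n) / 2 possible ones."""
    max_edges = (n * n - n) / 2
    return (n - 1) / max_edges * 100


# Values for the graph sizes of the experiment, rounded to two decimals.
densities = {n: round(minimal_density_percent(n), 2) for n in (20, 33, 50, 80, 100)}
```

For n = 33 the formula gives about 6.06%, which corresponds to the rounded value 6 listed for k in Table 1.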








Table 1. Specification of graph parameters used in the experiment

n      k [%]    d [%]
20     10       15, 20, 25, 30
33     6        12, 18, 24, 30
50     4        8, 12, 16, 20, 24
80     2.5      5, 10, 15, 20, 25, 30
100    2        4, 6, 8, 10, 12, 14

n the number of graph vertices, k the minimal graph density (4.1), d densities of graphs.

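The objective function was to minimise the sum of sizes of probability tables of the triangulated graph (the modified minimum weight criterion), where $n_i$ is the number of states of vertex $V_i$:

$$\sum_{C_j} \prod_{V_i \in C_j} n_i \qquad (4.2)$$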

The following heuristics were implemented: the minimum size heuristic (equal 
weights of vertices), the lexicographic search, the maximum cardinality search [3]. 

For each graph configuration of parameters the following steps were carried 
out: 



1. All heuristics were run n times (each vertex was eliminated as the first one, the 
best result for each heuristic was accepted). 

2. The evolutionary algorithm was run twice for each configuration of parameters
(the best result for each configuration was accepted).

3. The best result, obtained either by the heuristics or by any configuration of pa- 
rameters of the evolutionary algorithm, was treated as a reference point. Then, 
we computed the relative deviation (RD) between this reference point and re- 
sults obtained by the heuristics and the evolutionary algorithm in the following 
way: 



$$RD = \frac{v - \min(H, EA)}{\min(H, EA)} \cdot 100\,[\%] \qquad (4.3)$$

where:

- $RD$ is the relative deviation,

- $\min(H, EA)$ is the value of the reference point,

- $v$ is a result obtained by the heuristics or by a configuration of parameters of the evolutionary algorithm.
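The relative deviation of formula (4.3) is straightforward to compute once the best results are known. The sketch below assumes the results for one graph are collected in a dictionary keyed by method name; the names and the sample values are hypothetical and serve only to show the calculation.

```python
def relative_deviation(value, reference):
    """Relative deviation in percent of `value` from the reference point."""
    return (value - reference) / reference * 100.0


def deviations(results):
    """Map each method to its RD, taking the best (smallest) result as reference."""
    reference = min(results.values())
    return {name: relative_deviation(value, reference) for name, value in results.items()}


# Hypothetical objective values for one graph (not data from the paper).
example = deviations({"heuristic": 150.0, "evolutionary": 100.0})
```

The method that supplies the reference point always has a deviation of 0%, so a table of average deviations shows how far each method is, on average, from the best result found.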



For each graph configuration of parameters about 20 graphs were randomly generated in order to obtain the average values of RD for all heuristics and all configurations of parameters of the evolutionary algorithm. For each number of graph vertices, the density for which the average values of RD were the largest was chosen. For this density d* about 100 graphs were generated in order to obtain a better assessment of the average values of RD.

Experiments were conducted on 7 machines (each with an Intel Pentium 4 1.5 GHz processor, 256 MB of RAM and the Windows 2000 operating system).









5 Results 

For each version of the evolutionary algorithm there are 2800 configurations of parameters (7 crossover operators × 4 mutation operators × 10 crossover rates × 10 mutation rates). In order to reduce the computational time we rejected some configurations of parameters after the experiments for a given number of graph vertices. Therefore, experiments for the biggest graphs were carried out only for the best crossover operator, mutation operator and recombination rates. Moreover, at the beginning we rejected the third version of the evolutionary algorithm, whose results were worse than those of the other versions. Nevertheless, the experiments took 55 days (at most 7 machines were used). Performing these experiments on a single machine would take 173 days.

Almost all the best results were obtained for the second version of the evolutionary algorithm. For small graphs (20 and 33 vertices) the best results were obtained for the ISM and EM mutation operators with a mutation rate of 50-100%; crossover operators and their rates were less significant. For medium graphs (50 vertices) the best results were obtained for the ISM mutation operator, mutation rate 80-100%, crossover operators CX, OX2, OX1c, POS, PMX, crossover rate 10-50%. For the biggest graphs (80 and 100 vertices) the best results were obtained for the ISM mutation operator, mutation rate 100%, crossover operator OX2, crossover rate 40%. This configuration of parameters was chosen as the best one, EA*.

Let $\min(RD_H)$ represent the minimal relative deviation obtained by the heuristics for a given graph configuration of parameters; $RD_{EA^*}$ represents the relative deviation obtained by the best configuration of parameters of the evolutionary algorithm for a given graph configuration of parameters. Table 2 compares the averages of these values (for the density d*).



Table 2. Comparison of average values of RD (for density d*)

n      d* [%]       ave min(RD_H) [%]   ave RD_EA* [%]   ave t_H [s]   ave t_EA* [s]
20     20 (= 2k)    8                   0                <1            2
33     18 (= 3k)    78                  1.00             <1            5
50     16 (= 4k)    374                 5.14             <1            25
80     10 (= 4k)    2685                41.76            <1            108
100                 7768                40.74            <1            240

n the number of graph vertices, t_H the computational time of all heuristics,
t_EA* the computational time of the evolutionary algorithm for the best configuration of parameters.



6 Conclusions 

As we had expected, results obtained by the evolutionary algorithm are much 
better than results obtained by the heuristics. However, if an on-line triangulation 
is required the evolutionary algorithm can be too slow. 

Secondly, this extensive computational experiment showed which configuration of parameters of the evolutionary algorithm is the best one: parents are not replaced by children during the crossover phase and the mutation operator is applied to all individuals (the second version of the evolutionary algorithm, whose results are significantly better than those of the two other versions); the ISM mutation operator with the rate 100%; the crossover operator OX2 with the rate 40%.

Using graphical representations of the results made it possible to analyse them in a reasonable time. Using several PCs reduced the computational time of the experiment. The computational time of such experiments can be reduced considerably by running them on distributed architectures.

The very high rate of the ISM mutation operator was the reason that we tested 
the evolutionary algorithm for two benchmark graphs [3]. The evolutionary algo- 
rithm was able to achieve the optimal results. Concluding, we suggest improving 
crossover operators. 

The results of this experiment are the basis for our further research on reasoning
in Bayesian networks with knowledge updating.



References 

[1] Andersen SK, Jensen FV, Olesen KG (1990) An algebra of Bayesian Belief Universes for Knowledge-Based Systems. Networks 20: 637-659

[2] Jensen FV, Jensen F (1994) Optimal Junction Trees. Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers, San Francisco, pp 360-366

[3] Kjaerulff U (1990) Triangulation of Graphs - Algorithms Giving Small Total State Space. Research Report R90-09, Department of Computer Science, Aalborg University, Denmark

[4] Larrañaga P, Kuijpers C, Poza M, Murga R (1997) Decomposing Bayesian networks: triangulation of the moral graph with genetic algorithms. Statistics and Computing 7: 19-34

[5] Lauritzen SL, Spiegelhalter DJ (1988) Local computations with probabilities on graphical structures and their application to expert systems (with discussion). Journal of the Royal Statistical Society, Series B 50: 157-224

[6] Lukaszewski T (2001) Knowledge Updating in Bayesian Networks. Proceedings of the 14th International Conference on Systems Science, Oficyna Wydawnicza Politechniki Wroclawskiej, Wroclaw, pp 274-281

[7] Shenoy P, Shafer G (1990) Axioms for probability and belief-function propagation. In Shachter RD, Levitt TS, Lemmer JF, Kanal LN (eds) Uncertainty in Artificial Intelligence 4: 169-198





Approximation Algorithms for the k-center 
Problem: An Experimental Evaluation 



Jurij Mihelič and Borut Robič

Faculty of Computer and Information Science

University of Ljubljana, Tržaška 25, 1000 Ljubljana, Slovenia

{jurij.mihelic, borut.robic}@fri.uni-lj.si



Abstract. In this paper we deal with the vertex k-center problem, a problem
from discrete location theory. Informally, given a set of cities with
intercity distances specified, one has to pick k cities and build warehouses
in them so as to minimize the maximum distance of any city from its closest
warehouse. We examine several approximation algorithms that achieve an
approximation factor of 2 as well as other heuristic algorithms. In
particular, we focus on the clustering algorithm by Gonzalez, the parametric
pruning algorithm by Hochbaum and Shmoys, and Shmoys' algorithm. We discuss
several variants of the pure greedy approach. We also describe a new heuristic
algorithm for solving the dominating set problem, to which the k-center
problem is often reduced. We have implemented all the algorithms,
experimentally evaluated their quality on 40 standard test graphs from the
OR-Lib library, and compared their results with the results found in the
recent literature.

1 Introduction 

Problems of finding the best location of facilities in networks or graphs
abound in practical situations. One of the well-known facility location
problems is the vertex k-center problem, where, given n cities and the
distances between all pairs of cities, the aim is to choose k cities (called
centers) so that the largest distance of a city to its nearest center is
minimal. More formally, the vertex k-center problem can be defined as follows.
Let $G = (V, E)$ be a complete undirected graph with edge costs satisfying the
triangle inequality, and let $k$ be a positive integer not greater than $|V|$.
For any set $S \subseteq V$ and vertex $v \in V$, we define $d(v, S)$ to be
the length of a shortest edge from $v$ to any vertex in $S$. The problem is to
find a set $S \subseteq V$ with $|S| \le k$ that minimizes
$\max_{v \in V} d(v, S)$. The vertex k-center problem is NP-hard [5].
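The objective function can be made concrete in a few lines of Python. This is a minimal sketch, assuming the instance is given as a symmetric distance matrix `dist` (a list of lists); the function name `cover_radius` and the six-point instance are illustrative and do not come from the paper.

```python
def cover_radius(dist, centers):
    """Return the maximum over all vertices v of d(v, S), where S = centers."""
    return max(min(dist[v][c] for c in centers) for v in range(len(dist)))


# Hypothetical instance: six cities on a line, distances are absolute differences.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in points] for p in points]

print(cover_radius(dist, [1, 4]))  # one center in the middle of each cluster
print(cover_radius(dist, [0]))     # a single center at the left end
```

A k-center algorithm then searches for the set of at most k centers for which this value is smallest.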

A popular way to solve the k-center problem consists of solving a series of
set cover problems [2,4,9,10]. At each step, a threshold for the cover distance
is chosen and it is checked whether all vertices can be covered within this
distance using at most k centers; if so, the threshold is decreased, otherwise
it is increased. (One can also use the dominating set problem instead of the
set cover problem [7].) For example, Minieka [10] solved the k-center problem
as a series of set cover problems. More elaborate versions of this approach
were described by Daskin [2,3], who also used the maximum cover problem, and
by Elloumi et al. [4] and Ilhan et al. [9], who applied a more efficient
formulation of the problem. Usually, these set cover problems were solved with
integer programming. Another way to solve the k-center problem was recently
given by Mladenovic et al. [11], who used tabu search, variable neighborhood
search and various greedy methods. The greedy method was also applied by
Gonzalez [6], Hochbaum and Shmoys [8], and Shmoys [12]. The last three
describe 2-approximation algorithms, which are the best possible in the sense
that no r-approximation algorithm exists with r < 2 unless P = NP [7]. (No
approximation algorithm exists if the triangle inequality does not hold,
unless P = NP.)

In the following we briefly describe various heuristics for the k-center
problem that are not based on integer programming (Section 2). We then
describe a new heuristic, which combines the greedy approach with solving
the dominating set problem, and returns surprisingly good results (Section
3). We have experimentally evaluated all these heuristics as well as the new
one. The experimental results are given in Section 4.

2 Heuristics 

Greedy heuristics typically locate centers one by one until there are k
centers. For the selection of the first center there are several
possibilities. For example, the center can be located at random, it can be the
solution of the 1-center problem, or we can apply a heuristic n times,
n = |V|, each time with a different starting vertex, and then choose the best
of the solutions. These approaches will be called the random, 1-center, and
plus versions, respectively.

A very simple heuristic is the pure greedy method, where centers are located
one by one so that the objective function is reduced as much as possible each
time. For the selection of the first center we have implemented the random,
1-center and plus versions. It is easy to see that the time complexity of this
pure greedy method is $O(kn^2)$: each of the k steps examines n candidate
vertices, and evaluating the objective for one candidate takes $O(n)$ time.
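The pure greedy method and its plus version can be sketched as follows. This is an illustrative Python sketch, not the authors' Delphi implementation; it assumes a symmetric distance matrix `dist`, assumes k is at most the number of vertices, and breaks ties by the smallest vertex index. The names `pure_greedy` and `pure_greedy_plus` are hypothetical.

```python
def pure_greedy(dist, k, first):
    """Start from `first`, then repeatedly add the vertex that reduces the
    objective (the maximum cover distance) as much as possible."""
    n = len(dist)
    centers = [first]
    nearest = list(dist[first])      # distance of each vertex to its closest center
    while len(centers) < k:
        best_v, best_obj = None, None
        for v in range(n):           # n candidate vertices per step
            if v in centers:
                continue
            # objective value if v were added: O(n) per candidate
            obj = max(min(nearest[u], dist[v][u]) for u in range(n))
            if best_obj is None or obj < best_obj:
                best_v, best_obj = v, obj
        centers.append(best_v)
        nearest = [min(nearest[u], dist[best_v][u]) for u in range(n)]
    return centers, max(nearest)


def pure_greedy_plus(dist, k):
    """Plus version: try every vertex as the first center and keep the best run."""
    return min((pure_greedy(dist, k, s) for s in range(len(dist))),
               key=lambda result: result[1])


# Hypothetical instance: six cities on a line.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in points] for p in points]
print(pure_greedy(dist, 2, first=0))
print(pure_greedy_plus(dist, 2))
```

On this small instance the choice of the first center already changes the result, which is the motivation for the plus version.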

Another greedy heuristic for the k-center problem was described by Gonzalez
[6], who was able to prove an approximation factor of 2. The algorithm builds
the final solution in k steps so that, given a partial solution $C_{i-1}$, it
forms a new partial solution $C_i$ by extending $C_{i-1}$ with the vertex $v$
which is the farthest from $C_{i-1}$, i.e. the vertex $v$ which maximizes
$d(v, C_{i-1})$ at step $i$. We have implemented the random, 1-center, and
plus versions of this algorithm. The time complexity of Gonzalez's algorithm
is $O(kn)$.
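The farthest-first rule described above can be written compactly. The following Python sketch assumes a symmetric distance matrix `dist` and a caller-supplied first center; the function name `gonzalez` and the test instance are illustrative assumptions.

```python
def gonzalez(dist, k, first=0):
    """Farthest-first traversal: extend the partial solution with the vertex
    that is farthest from the centers chosen so far."""
    n = len(dist)
    centers = [first]
    nearest = list(dist[first])      # d(v, C_{i-1}) for every vertex v
    while len(centers) < k:
        v = max(range(n), key=lambda u: nearest[u])   # vertex maximizing d(v, C_{i-1})
        centers.append(v)
        # one O(n) update per step gives O(kn) overall
        nearest = [min(nearest[u], dist[v][u]) for u in range(n)]
    return centers, max(nearest)


# Hypothetical instance: six cities on a line.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in points] for p in points]
print(gonzalez(dist, 2, first=0))
```

Keeping the array `nearest` up to date is what keeps each step linear in n.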

Shmoys [12] briefly describes a 2-approximation algorithm for the decision
version of the k-center problem, i.e. where a radius r is also given and the
aim is to decide whether there exist k vertices such that the coverage
distance from these vertices is at most r. The algorithm repeatedly chooses
one of the remaining vertices v, adds it to the partial solution, and deletes
all vertices whose distance to v is at most 2r. At the end, if the size of the
solution exceeds k, the algorithm outputs "no", otherwise "yes". We
implemented two versions, in which either a random vertex or a vertex with
maximum degree is chosen at each step. The algorithm for the optimization
version of the problem runs the algorithm for the decision version several
times with increasing values of r. The time complexity of this algorithm is
$O(kn^{?})$.
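A sketch of the decision procedure and of the outer loop over increasing radii is given below. It is written in Python under several assumptions that are not in the paper: the candidate radii are the distinct entries of the distance matrix, and the vertex chosen at each step is simply the remaining vertex with the smallest index (the paper uses a random vertex or a vertex of maximum degree). The names `shmoys_decision` and `shmoys` are hypothetical.

```python
def shmoys_decision(dist, k, r):
    """Return a list of at most k centers covering every vertex within 2r,
    or None (answer "no") if more than k centers would be needed."""
    remaining = set(range(len(dist)))
    centers = []
    while remaining:
        v = min(remaining)           # assumption: smallest index instead of random/degree
        centers.append(v)
        remaining -= {u for u in remaining if dist[v][u] <= 2 * r}
    return centers if len(centers) <= k else None


def shmoys(dist, k):
    """Optimization version: try candidate radii in increasing order."""
    n = len(dist)
    radii = sorted({dist[i][j] for i in range(n) for j in range(n)})
    for r in radii:
        centers = shmoys_decision(dist, k, r)
        if centers is not None:
            return centers, r
    return None


# Hypothetical instance: six cities on a line.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in points] for p in points]
centers, r = shmoys(dist, 2)
print(centers, r)
```

The returned centers cover every vertex within distance 2r, where r is the first radius for which the decision procedure answers "yes".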

Hochbaum and Shmoys [8] introduced the algorithmic technique called
parametric pruning for solving the k-center problem. Initially, edge costs are
sorted in nondecreasing order. For each edge cost t the graph is pruned by
removing edges with cost greater than t. The aim is to find a minimum
dominating set in the pruned graph, i.e. the smallest set S of vertices such
that every vertex not in S is adjacent to one of the vertices in S. For the
smallest edge cost t at which the cardinality of the minimum dominating set of
the pruned graph is at most k, such a dominating set is also an optimal
solution of the k-center problem.

Unfortunately, computing a minimum dominating set is an NP-hard optimization
problem [5]. Consequently, instead of searching for a minimum dominating set
we rather search for a maximal independent set¹, i.e. a subset S of V such
that no two vertices of S are connected in G and no vertex can be added to S
while S retains this property.

Define the square of the graph $G$ to be the graph $G^2$ containing an edge
$(u, v)$ whenever $G$ has a path of at most two edges between $u$ and $v$,
$u \ne v$. It is well known that every maximal independent set is also a
dominating set. The fact that the cardinality of a maximal independent set of
$G^2$ is at most the cardinality of a minimum dominating set of $G$ can be
used to construct a 2-approximation algorithm for solving the k-center
problem. More specifically, instead of searching for a minimum dominating set
of $G$ the algorithm constructs a maximal independent set of $G^2$. The
overall time complexity of the algorithm is estimated to be $O(kn^{?})$.
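The parametric pruning idea can be illustrated with a deliberately simple Python sketch. It assumes a symmetric distance matrix `dist` with at least two vertices, scans the thresholds linearly, recomputes the squared graph from scratch for each threshold, and builds the maximal independent set greedily in index order. None of these choices is prescribed by the paper, and the function name `hochbaum_shmoys` is hypothetical.

```python
def hochbaum_shmoys(dist, k):
    """For each edge cost t (in nondecreasing order) build the pruned graph G_t,
    take a maximal independent set of its square, and stop at the first t for
    which that set has at most k vertices."""
    n = len(dist)
    costs = sorted({dist[i][j] for i in range(n) for j in range(i + 1, n)})
    for t in costs:
        # pruned graph: keep only edges of cost at most t
        adj = [[u != w and dist[u][w] <= t for w in range(n)] for u in range(n)]
        # squared graph: u and w are adjacent if joined by a path of at most two edges
        adj2 = [[u != w and (adj[u][w] or any(adj[u][x] and adj[x][w] for x in range(n)))
                 for w in range(n)] for u in range(n)]
        # greedy maximal independent set of the squared graph
        mis = []
        for v in range(n):
            if not any(adj2[v][c] for c in mis):
                mis.append(v)
        if len(mis) <= k:
            return mis, t
    return None


# Hypothetical instance: six cities on a line.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in points] for p in points]
print(hochbaum_shmoys(dist, 2))
```

Because the returned set is a maximal independent set of the squared graph, every vertex is within two edges of cost at most t of some chosen center, which is where the factor 2 comes from.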

3 Elimination Heuristic 

We designed a new algorithm for the k-center problem. The algorithm is based
on the standard approach that solves a series of dominating set problems.
First, edge costs are sorted in a nondecreasing list which is used for getting
the threshold values r and for solving the series of dominating set problems.
When the cardinality of the dominating set S becomes at most k, the set S is
returned as the result of the k-center problem and the algorithm is completed.

For solving the dominating set problem we developed a new heuristic
algorithm. Informally, a pair of numbers $(c(v), s(v))$ is initially assigned
to each vertex $v \in V$, where $c(v)$ (cover count) is the number of vertices
that can cover $v$ within distance $r$, while $s(v)$ (vertex score) is used in
the following selection process ($s(v)$ is initially set to $c(v)$). At each
step of the selection process the vertex $v$ with the smallest $s(v)$ is
chosen. If there is a vertex $u \in V$ such that
$d(u, v) \le r \wedge c(u) = 1$, then $v$ is added to the set $S$ of centers;
otherwise, $s(u)$ is incremented for all $u \in V$ for which $d(u, v) \le r$.
Next, the cover count $c(u)$ is decremented for all vertices $u \in V$ with
$d(u, v) \le r$. These steps are repeated until all the vertices of the graph
have been processed.

¹ Maximum independent set is an NP-hard problem, while a maximal independent
set is one of the suboptimal solutions.

Notice that we use $c(v)$ to ensure that every vertex is covered at least
once. In addition, the way $c(v)$ is used makes it possible to easily adapt
the algorithm for solving the (fault-tolerant) α-neighbor k-center problem,
where every node must be covered by at least α centers. Moreover, one can
adapt the algorithm to solve the minimum set cover problem instead of the
minimum dominating set problem. This is useful when solving the so-called
k-supplier problem, where k centers must be chosen from a predefined set of
vertices.
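The informal description of the elimination heuristic leaves some details open, so the following Python sketch is only one possible reading of it. Assumptions not taken from the paper: the distance matrix `dist` is symmetric, ties in the score are broken by the smallest vertex index, the cover counts are decremented after every processed vertex (whether or not it became a center), and the thresholds are the distinct entries of `dist`. The function names are hypothetical.

```python
def elimination_dominating_set(dist, r):
    """Heuristic dominating set of the threshold graph with radius r."""
    n = len(dist)
    nbrs = [[u for u in range(n) if dist[v][u] <= r] for v in range(n)]
    cover = [len(nbrs[v]) for v in range(n)]   # c(v): vertices that can still cover v
    score = cover[:]                           # s(v): initially equal to c(v)
    done = [False] * n
    centers = []
    for _ in range(n):
        v = min((u for u in range(n) if not done[u]), key=lambda u: (score[u], u))
        done[v] = True
        if any(cover[u] == 1 for u in nbrs[v]):
            centers.append(v)                  # v is the last vertex able to cover some u
        else:
            for u in nbrs[v]:                  # v is eliminated
                score[u] += 1
        for u in nbrs[v]:
            cover[u] -= 1
    return centers


def k_center_elimination(dist, k):
    """Solve a series of dominating set problems for increasing thresholds r."""
    n = len(dist)
    for r in sorted({dist[i][j] for i in range(n) for j in range(n)}):
        centers = elimination_dominating_set(dist, r)
        if len(centers) <= k:
            radius = max(min(dist[v][c] for c in centers) for v in range(n))
            return centers, radius
    return None


# Hypothetical instance: six cities on a line.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in points] for p in points]
print(k_center_elimination(dist, 2))
```

On this instance the low-score end vertices are eliminated first, so the two cluster midpoints remain as the only vertices able to cover their neighbours.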

4 Experimental Results 

We tested the described algorithms on 40 OR-Lib test problems, which were
originally designed for testing p-median problems [1]. The number of vertices
ranges from 100 to 900, while k ranges from 5 to 200 (see Table 2). The
preprocessing phase runs the all-pairs shortest paths algorithm (time
complexity $O(n^3)$).

All the algorithms were implemented in Borland Delphi 6.0 and were tested on
a computer with an Intel processor running at 1.7 GHz with 512 MB of system
memory. Designations and names of the algorithms appear in Table 1. Although
our primary aim was to compare the quality of the solutions, let us mention
that the Gonzalez algorithms were the fastest (running below 1 second). The
average time of HS was about 100 seconds. The pure greedy methods were quite
fast (about 2 seconds on average), but their execution time was highly
variable and dependent on the parameter k. Shmoys' variants were also fast, as
was our Scr. (Notice that the plus variants run much slower because they try
all vertices as the first center.)

Recall that the approximation factor is the ratio between the approximate and
the optimal objective value. Since we sometimes do not know the optimal
solution, we take the best value known so far. We call such a ratio an
approximation degree. Nevertheless, in our case most of the best known
objective values were proved to be optimal (see for example [4,9]).
Approximation degrees for each algorithm are given in Table 1 below, where we
also included the results for Daskin's and the tabu search approach.

Objective values for all of the 40 problems are given in Table 2. The pure
greedy method is the worst, and the plus variant gives only slightly better
results. Its solution quality strongly depends on the parameter k and is much
better for low values of k. The Gonzalez algorithms are very fast, and their
solutions are about 50% worse than the best known ones. The algorithms HS,
ShR and ShD exploit very similar problem properties and consequently return
very similar results. Our algorithm Scr proved to be quite competitive, since
it achieved better results than any of the other implemented algorithms; it
is outperformed only by the integer programming approaches [3,4,9] and by
tabu search and variable neighborhood search [11].








Algorithm  Level  Deviation  Description
GrR        1.697  0.559      Pure greedy first random
Gr1        1.675  0.570      Pure greedy first 1-center
Gr+        1.512  0.550      Pure greedy plus
GonR       1.495  0.130      Gonzalez first random
Gon1       1.398  0.128      Gonzalez first 1-center
Gon+       1.317  0.139      Gonzalez plus
ShR        1.432  0.112      Shmoys random
ShD        1.343  0.105      Shmoys degree
HS         1.462  0.177      Hochbaum-Shmoys
Scr        1.058  0.043      Elimination heuristic
Das        1.002  0.007      Daskin
TS         1.025  0.045      Tabu search

Table 1. Approximation degrees



References 

1. J. E. Beasley. A note on solving large p-median problems. European J. Oper.
Res., 21:270-273, 1985.

2. Mark S. Daskin. Network and Discrete Location: Models, Algorithms and
Applications. Wiley, New York, 1995.

3. Mark S. Daskin. A new approach to solving the vertex p-center problem to
optimality: Algorithm and computational results. Communications of the
Operations Research Society of Japan, 45:9:428-436, 2000.

4. Sourour Elloumi, Martine Labbe, and Yves Pochet. New formulation and
resolution method for the p-center problem. 2001.

5. M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the
Theory of NP-Completeness. W. H. Freeman and Co., San Francisco, 1979.

6. T. Gonzalez. Clustering to minimize the maximum intercluster distance.
Theoretical Computer Science, 38:293-306, 1985.

7. Dorit S. Hochbaum, editor. Approximation Algorithms for NP-hard Problems.
PWS Publishing Company, Boston, 1995.

8. Dorit S. Hochbaum and David B. Shmoys. A best possible heuristic for the
k-center problem. Mathematics of Operations Research, 10:180-184, 1985.

9. Taylan Ilhan and Mustafa Pinar. An efficient exact algorithm for the vertex
p-center problem. 2001.

10. E. Minieka. The m-center problem. SIAM Rev., 12:138-139, 1970.

11. N. Mladenovic, M. Labbe, and P. Hansen. Solving the p-center problem with
tabu search and variable neighborhood search. 2000.
http://smg.ulb.ac.be/Preprints/Labbe00_20.html.

12. David B. Shmoys. Computing near-optimal solutions to combinatorial
optimization problems. Technical report, Ithaca, NY 14853, 1995.
http://citeseer.nj.nec.com/shmoys95computing.html.








 #    n    k  Best   GrR   Gr1   GrP   Gon  Gon1  Gon+    HS   ShR   ShD   Scr   Das    TS
 1  100    5   127   143   133   133   186   162   155   184   188   171   133   127   127
 2  100   10    98   117   117   110   131   124   117   160   128   135   109    98    98
 3  100   10    93   126   116   106   154   133   124   160   140   120    99    93    93
 4  100   20    74   127   127    92   114    99    92   124   109    84    83    74    74
 5  100   33    48    87    87    78    71    64    62    77    62    59    48    48    48
 6  200    5    84    98    94    89   138    99    98   126   138   106    90    84    84
 7  200   10    64    78    79    77    96    87    85    90    88    90    70    64    64
 8  200   20    55    72    72    72    82    72    71    84    74    68    60    55    55
 9  200   40    37    73    73    63    57    51    49    62    50    52    38    37    37
10  200   67    20    44    44    38    31    29    29    32    28    28    20    20    20
11  300    5    59    68    67    61    73    68    68    82    73    74    60    59    59
12  300   10    51    62    72    56    71    70    66    78    74    70    53    51    51
13  300   30    35    64    64    52    59    51    49    60    54    52    38    36    36
14  300   60    26    60    60    46    40    36    36    44    36    34    27    26    26
15  300  100    18    42    42    40    25    25    23    30    22    20    18    18    18
16  400    5    47    52    51    47    84    55    52    64    83    58    48    47    47
17  400   10    39    50    50    43    56    51    48    56    56    52    41    39    39
18  400   40    28    50    50    42    44    41    39    46    40    38    31    28    28
19  400   80    18    40    40    31    28    28    27    30    26    24    20    18    19
20  400  133    13    32    32    32    19    19    17    22    18    16    14    13    14
21  500    5    40    48    48    42    53    51    45    52    53    45    40    40    40
22  500   10    38    48    49    43    56    54    47    54    54    48    41    38    38
23  500   50    22    41    41    35    34    33    32    36    32    30    24    22    23
24  500  100    15    35    35    32    23    23    21    24    22    20    17    15    16
25  500  167    11    27    27    27    15    15    15    18    16    14    11    11    12
26  600    5    38    44    43    39    50    47    43    52    50    52    41    38    38
27  600   10    32    37    39    35    43    42    55    42    44    44    33    32    32
28  600   60    18    33    33    27    28    28    25    28    28    28    20    18    19
29  600  120    13    34    36    34    19    19    18    22    18    18    13    13    13
30  600  200     9    29    29    29    14    14    13    16    12    12    10     9    11
31  700    5    30    35    34    31    42    38    36    40    42    44    30    30    30
32  700   10    29    35    35    32    45    43    37    40    44    40    31    29    29
33  700   70    15    32    26    24    26    25    23    26    24    22    17    15    16
34  700  140    11    30    30    27    17    17    16    18    16    16    11    11    12
35  800    5    30    37    32    31    38    37    34    40    38    38    32    30    30
36  800   10    27    34    34    30    41    41    34    38    42    38    28    27    27
37  800   80    15    26    26    26    25    24    23    24    22    22    16    15    16
38  900    5    29    42    35    31    36    38    31    38    40    38    29    29    29
39  900   10    23    27    28    25    35    35    28    32    36    34    24    23    24
40  900   90    13    25    22    22    21    20    19    22    20    20    14    13    14

Table 2. Objective function values




MaxFlow-MinCut Duality for a Paint Shop 
Problem 



Thomas Epping¹, Winfried Hochstättler¹ and Marco E. Lübbecke²

¹ Department of Mathematics, BTU Cottbus, 03013 Cottbus, Germany
{epping, hochstaettler}@math.tu-cottbus.de
² Department of Mathematical Optimization, TU Braunschweig, Germany
m.luebbecke@tu-bs.de



Abstract. Motivated by an application in car manufacturing we consider the fol- 
lowing problem: How can we synthesize a given word from restricted reservoirs of 
colored letters with a minimal number of color changes between adjacent letters? 

We focus on instances in which each letter occurs exactly twice, once in each 
of two given colors. In this case the problem turns out to be the dual of a MinCut 
problem for one point extensions of a certain class of regular matroids. 

We discuss consequences of the MaxFlow-MinCut duality and describe algo- 
rithmic approaches. 



1 Introduction 

A sequence of car bodies has to pass various shops during the production 
process, including a press and a body shop, a paint shop, and an assembly 
shop. The daily sequencing of the car bodies has to be done with respect to the 
minimization of the specific objective function of each of these shops. We focus 
on the paint shop, where the objective function consists in the minimization of 
the number of color changes that occur whenever two consecutive car bodies 
have to be colored in different colors, giving rise to non-negligible costs and 
water pollution. 

The number of color changes may be reduced by the use of interim stor- 
age systems (see [1]) that permute a car body sequence before it enters the 
paint shop. However, the efficiency of succeeding production shops may also 
significantly decrease for the permuted sequence. We therefore consider the 
car body sequence to be a fixed external parameter. 

Together with the fact that current technology is moving towards the
detachment of car bodies and their features (which allows us to uncouple car
bodies and enamel colors), the minimization of color changes for a car body
sequence yields a new type of combinatorial problem.

We first give a formal problem description, a review of complexity results, 
and previous results and conjectures on the minimal number of color changes 
for regularly structured instances. Among these, we examine particular in- 
stances from a matroid point of view in more detail. Our notation is fairly 
standard. 







2 Problem Formulation and Previous Results 

We are given a fixed sequence of car bodies together with a set of orders that
specify the demand of each car body type in each color. We assume that these
orders are given by an initial coloring of the car body sequence. We associate
a letter of an alphabet $\Sigma$ with each car body type and represent a
sequence of $n$ car bodies by a word $w \in \Sigma^n$. A coloring of $w$ is
represented by a vector $f \in F^n$ for some color set $F$, where $f_i$
denotes the color of $w_i$ for all $i$. We say that we have a color change
within $f$ whenever $f_i \ne f_{i+1}$.

Our problem consists in finding a permutation that minimizes the number
of color changes within $f$ and leaves the sequence of letters in $w$
unchanged.

Problem 1. Paint Shop Problem for Words (PPW)

Instance A finite alphabet $\Sigma$, a word
$w = (w_1, \ldots, w_n) \in \Sigma^n$, a finite color set $F$, and a coloring
$f = (f_1, \ldots, f_n) \in F^n$ of $w$.

Question Find a permutation $\sigma : \{1, \ldots, n\} \to \{1, \ldots, n\}$
such that $w_{\sigma(i)} = w_i$ for $i = 1, \ldots, n$, and the number of
color changes within $\sigma(f) = (f_{\sigma(1)}, \ldots, f_{\sigma(n)})$ is
minimized.
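For very small instances PPW can be solved by plain enumeration, which is useful for checking examples by hand. The Python sketch below is not the dynamic program of [2]; it simply tries every way of redistributing each letter's own colors over that letter's own positions, which keeps the word and the color reservoirs unchanged. The function names and the encoding of colors as the characters '0' and '1' are assumptions made for the illustration.

```python
from itertools import permutations, product


def color_changes(coloring):
    """Number of positions i with f_i != f_{i+1}."""
    return sum(1 for a, b in zip(coloring, coloring[1:]) if a != b)


def min_color_changes(word, coloring):
    """Brute-force minimum number of color changes (exponential time)."""
    positions = {}
    for i, letter in enumerate(word):
        positions.setdefault(letter, []).append(i)
    # For every letter: all distinct assignments of its colors to its positions.
    choices = []
    for pos in positions.values():
        colors = [coloring[i] for i in pos]
        choices.append([(pos, perm) for perm in set(permutations(colors))])
    best = None
    for combo in product(*choices):
        recolored = list(coloring)
        for pos, perm in combo:
            for i, c in zip(pos, perm):
                recolored[i] = c
        changes = color_changes(recolored)
        if best is None or changes < best:
            best = changes
    return best


# The word of Figure 1(a); each letter occurs once in color '0' and once in '1'.
word = "ABCADDEBEC"
coloring = "0001010111"   # first occurrence '0', second occurrence '1'
print(color_changes(coloring), min_color_changes(word, coloring))
```

For this word the enumeration covers only 2^5 recolorings, so it finishes instantly; the approach does not scale beyond toy instances.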

The following complexity results hold. 

Theorem 1 ([2]). Any instance of PPW can be solved by a dynamic program
with a space and time complexity [...]. PPW is NP-complete for either
$|F| = 2$ or $|\Sigma| = 2$, that is, if at least one parameter is unbounded.

2.1 Upper Bounds for Regular Instances 

We now turn to structured instances of PPW. In the following, we denote
an instance of PPW by $(w; f)$, and the minimal number of color changes for
$(w; f)$ by $\gamma(w; f)$. Recall that the coloring $f$ determines reservoirs
of colored letters for $(w; f)$. We denote the reservoir of letter $i$ in
color $j$ by $R(i, j)$.

Definition 1. Given a fixed integer $k \ge 1$, we call an instance $(w; f)$ of
PPW k-regular if $R(i, j) = k$ for all letters $i$ and colors $j$.

The following upper bounds on $\gamma(w; f)$ hold for k-regular instances.

Lemma 1 ([2]). If $|\Sigma| = |F| = 2$, then $\gamma(w; f) \le 2$. If
$|\Sigma| = 2$, then $\gamma(w; f) \le 2(|F| - 1)$.

We conjecture the following general upper bound on $\gamma(w; f)$. A linear
algebra argument yields the correctness of Conjecture 1 for $k = 1$.

Conjecture 1 ([2]). For a k-regular instance,
$\gamma(w; f) \le |\Sigma|(|F| - 1)$ holds.

Theorem 2. For a 1-regular instance, $\gamma(w; f) \le |\Sigma|(|F| - 1)$
holds.

Examples (see [2]) show that the bound given in Conjecture 1 is tight if 
the conjecture is correct. Natural solution approaches like a greedy coloring 
algorithm or an improvement algorithm based on color exchanges of letters or 
letter blocks fail to yield or even approximate an optimal solution in general. 








3 1-Regular Instances and Two Colors 

This section focuses on 1-regular instances with a color set of size |F| = 2, 
thus every letter is available exactly once in each color. We do not know 
whether PPW is polynomially solvable when restricted to such instances, 
but we give some indications that might support such a conjecture. In the 
following we denote by 0 resp. 1 the vector (of appropriate dimension) of all 
zeros resp. ones. 



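(a) A B C A D D E B E C

(b)
$$A = \begin{pmatrix}
1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0
\end{pmatrix}$$

Fig. 1. An example instance and the associated matrix A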



First we associate an interval $I(b)$ to each letter $b \in \Sigma$, where
$I(b)$ runs from the first occurrence of $b$ in $w$ to the second (see Figure
1(a)). We may consider each $I(b)$ as an interval on the real line. The
restriction of PPW to 1-regular instances with $|F| = 2$ is then equivalent to
the following problem.

Problem 2. Odd Intersection of Intervals 

Instance A set of closed intervals on the real line, where no two intervals 
share a common endpoint. 

Question Find a minimal set of points on the real line that intersects each 
interval in an odd number of points. 

We denote the set of intervals of an instance $(w; f)$ of Problem 2 by
$I(w; f)$ and interpret each point that intersects one or more intervals of
$I(w; f)$ as a color change or a cut between two adjacent letters of $w$. In
the easiest case, $I(w; f)$ is a clutter.

Lemma 2. If $I(w; f)$ is a clutter, then $\gamma(w; f)$ can be computed by a
greedy algorithm.

Otherwise, we consider the $(|\Sigma| \times (n - 1))$-matrix $A$ (see Figure
1(b)) that is defined by

$$a_{ij} = \begin{cases}
1, & \text{if a cut between } w_j \text{ and } w_{j+1} \text{ intersects } I(i), \text{ and} \\
0, & \text{otherwise.}
\end{cases}$$
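The definition of A translates directly into code. The following Python sketch assumes a 1-regular two-color instance given only by its word (every letter occurring exactly twice) and orders the rows by first occurrence of the letters; the function name `interval_matrix` is an illustrative assumption.

```python
def interval_matrix(word):
    """Row b has a 1 in column j exactly if the cut between the j-th and the
    (j+1)-th letter lies inside the interval I(b)."""
    first, last = {}, {}
    for pos, letter in enumerate(word):
        first.setdefault(letter, pos)
        last[letter] = pos
    n = len(word)
    # 0-based column j is the cut between word[j] and word[j + 1]
    return [[1 if first[b] <= j < last[b] else 0 for j in range(n - 1)]
            for b in first]


for row in interval_matrix("ABCADDEBEC"):
    print("".join(map(str, row)))
```

Applied to the word of Figure 1(a), the printed rows reproduce the matrix of Figure 1(b), and every row visibly consists of one block of consecutive ones.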

Obviously A has the consecutive ones (CO) property for rows (see [3]). Thus, 
A is totally unimodular. Furthermore, A contains at least all maximal cliques 
of the interval graph of I{w; /). Note that any instance of Problem 2 can be 
solved by the integer program depicted in Figure 2(a). 








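(a)
$$\begin{aligned}
\mathbf{1}^T x &\to \min! \\
Ax - 2Iy &= \mathbf{1} \\
x_i &\in \{0, 1\} \\
y_i &\in \mathbb{N}_{\ge 0}
\end{aligned}$$

(b)
$$\begin{aligned}
\mathbf{1}^T x &\to \min! \\
Ax - 2Iy &= b \\
x_i &\in \{0, 1\} \\
y_i &\in \mathbb{N}_{\ge 0}
\end{aligned}$$

Fig. 2. Integer programs for the solution of Problem 2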



3.1 Lower Bounds for the Number of Interval Cuts 

The fact that A is a node-clique matrix of an interval graph enables us to
compute a lower bound on the number of intersections of each interval in an
optimal solution for Problem 2. To this end we consider the partial ordering
on $I(w; f)$ given by proper containment, i.e.
$I(b') < I(b) :\Leftrightarrow I(b') \ne I(b)$ and $I(b')$ is properly
contained in $I(b)$. We define the set $C(b) := \{I(b') : I(b') < I(b)\}$ for
all $b \in \Sigma$ and assign a lower bound of 1 to all $I(b)$ with
$C(b) = \emptyset$. Then, we compute a lower bound for an interval $I(b)$
whenever all intervals in $C(b)$ have already been assigned a lower bound. For
this we have to compute the value of a maximum weighted stable set in $C(b)$,
where the weight of each interval is given by its lower bound. The result has
to be rounded up to the next odd integer. Note that the maximum weighted
stable set problem can be solved in polynomial time on interval graphs by
computing a maximum weighted clique on its complement graph (see [3]).
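The bottom-up computation of the lower bounds can be sketched in Python as follows. The sketch assumes a word in which every letter occurs exactly twice, and it computes the maximum weighted stable set among the contained intervals with the standard dynamic program for weighted interval scheduling rather than via a clique in the complement graph; this substitution and the function name `lower_bounds` are choices made for the illustration.

```python
def lower_bounds(word):
    """Lower bound on the number of cuts inside I(b) for every letter b."""
    first, last = {}, {}
    for pos, letter in enumerate(word):
        first.setdefault(letter, pos)
        last[letter] = pos
    # A properly contained interval is strictly shorter, so processing the
    # intervals by increasing length handles C(b) before I(b).
    letters = sorted(first, key=lambda b: last[b] - first[b])
    bound = {}
    for b in letters:
        inner = [c for c in letters if first[b] < first[c] and last[c] < last[b]]
        if not inner:
            bound[b] = 1
            continue
        inner.sort(key=lambda c: last[c])
        best = []                      # best[i]: max weight using inner[0..i]
        for i, c in enumerate(inner):
            prev = 0
            for j in range(i - 1, -1, -1):
                if last[inner[j]] < first[c]:   # inner[j] ends before c starts
                    prev = best[j]
                    break
            take = bound[c] + prev
            skip = best[i - 1] if i > 0 else 0
            best.append(max(take, skip))
        weight = best[-1]
        bound[b] = weight if weight % 2 == 1 else weight + 1   # round up to odd
    return bound


print(lower_bounds("ABCADDEBEC"))
```

For the word of Figure 1(a) the interval I(C) contains the disjoint intervals I(D) and I(E), so its bound is 2 rounded up to 3, while all other bounds are 1.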

Theorem 3. Let $P$ resp. $P'$ denote the IP shown in Figure 2(a) resp. (b),
and let $F(P)$ denote the set of vectors feasible for $P$. Then the set of
vectors feasible for $P'$ is given by
$F(P') = \{(x, y - \tfrac{1}{2}(b - \mathbf{1})) : (x, y) \in F(P)\}$.

Thus a feasible solution for $P'$ can be derived from a feasible solution for
$P$ (and vice versa) by changing only the vector $y$. In the following we
denote the vector of lower bounds, computed as described above, by $b$. Note
that we get (in addition to Theorem 2) a lower bound on $\gamma(w; f)$ if we
enclose $w$ by an extra letter A and compute the lower bound on the number of
intersections of $I(A)$, this time without rounding the result up to the next
odd integer.

3.2 A Dual Pair of Linear Programs 

In this section we apply the "Big-M" method and consider the dual pair of
LPs depicted in Figure 3. Recall that $A$ (and thus $(A, I)$) is totally
unimodular, so both LPs have integer solutions. We call an optimal solution
$(x^*, y^*)$ of the primal LP R-feasible (R-optimal) if $x^*$ is a feasible
(optimal) solution for Problem 2. Note that any $(x^*, y^*)$ with $y^* = 0$ is
R-feasible, but not necessarily R-optimal.

Theorem 4. If $(x^*, y^*)$ is an optimal solution of the primal LP with
$y^* = 0$, then $x_i^* \in \{0, 1\}$ for all $i$.








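$$\begin{aligned}
\mathbf{1}^T x + M^T y &\to \min! \\
Ax - Iy &= b \\
x, y &\ge 0
\end{aligned}
\qquad\qquad
\begin{aligned}
u^T b &\to \max! \\
u^T A &\le \mathbf{1}^T \\
u &\ge -M
\end{aligned}$$

Fig. 3. The primal LP and its dual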



If $(x^*, y^*)$ is not R-optimal, the lower bound $b$ is not tight. A rough
statistic shows that the percentage of such instances increases from less than
5% for $|\Sigma| = 5$ to more than 90% for $|\Sigma| = 25$. For example, both
instances shown in Figure 4 have a lower bound of $b = \mathbf{1}$, which is
not tight. In such cases we are searching for an adaptation of $b$ so that the
solution of the primal LP yields an R-optimal $(x^*, y^*)$. Based on recent
computational experiments we conjecture the following.

Conjecture 2. Suppose that $b$ is not tight and let $U := \{j : u_j < 0\}$
denote the index set of negative dual variables after the solution of the
primal LP. Then there exists $I \subseteq \Sigma$ with $I \subseteq U$ such
that the primal LP yields an R-optimal solution if we increase $b_i$ by 2 for
all $i \in I$.

Figure 1(a) shows an example of an instance for which the primal LP yields an
R-feasible solution and an objective value of 4, while we get
$\gamma(w; f) = 3$ if we increase the lower bound on the number of
intersections of $I(B)$ by 2.



3.3 MaxFlow-MinCut Duality 

We get an even stronger duality than the LP duality described in Section 3.2 
if we formulate Problem 2 in terms of matroid theory. 

To this end, we identify the set of feasible solutions to Problem 2 with
the set of feasible solutions of the equation $Ax = \mathbf{1} \bmod 2$ over
GF(2). If we replace $A$ by $(A, \mathbf{1})$, this equation is equivalent to
$(A, \mathbf{1})x = 0 \bmod 2$, if we demand that $x_n = 1$ always. Thus we
are seeking a minimal element in the kernel of $(A, \mathbf{1})$, or, in other
words, a shortest circuit in the binary vector matroid $(A, \mathbf{1})$ that
contains the element $\mathbf{1}$. This yields an equivalent formulation of
Problem 2.
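The GF(2) formulation can be checked on the small instances of this paper by exhaustive search. The Python sketch below enumerates cut sets by increasing size and returns the first one that meets every interval an odd number of times, i.e. a solution of Ax = 1 mod 2 with the fewest ones. It is exponential in the word length and only meant as an illustration; the function name `min_cuts` is hypothetical.

```python
from itertools import combinations


def min_cuts(word):
    """Smallest set of cuts that intersects every interval an odd number of times."""
    first, last = {}, {}
    for pos, letter in enumerate(word):
        first.setdefault(letter, pos)
        last[letter] = pos
    # rows of A as sets of column indices; cut j lies between word[j] and word[j + 1]
    rows = [set(range(first[b], last[b])) for b in first]
    n = len(word)
    for size in range(n):
        for cuts in combinations(range(n - 1), size):
            chosen = set(cuts)
            if all(len(row & chosen) % 2 == 1 for row in rows):
                return size, cuts
    return None


for w in ("ABCADDEBEC", "ABBCDACEED", "ABCDBEDAEC"):
    print(w, min_cuts(w)[0])
```

The three words are the instances of Figure 1(a) and Figure 4(a) and (b); the search confirms the values 3, 4 and 3 quoted in the text.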

Problem 3. Shortest circuit in a clutter 

Instance A matrix A with the CO property for rows, where no two se- 
quences of consecutive ones start or end in a common column. 
Question Find a shortest circuit in the clutter of all circuits of the binary 
vector matroid (A, 1) that contain the element 1. 

Note that computing a shortest circuit in a binary matroid is NP-complete 
in general. Furthermore, recall that A is totally unimodular. Thus (A, 1) 
is a one point extension of a regular matroid. We cite the dual version of 
Seymour’s famous MaxFlow-MinCut theorem. 








Theorem 5 (Seymour [4]). Given a binary matroid $M = (E, \mathcal{I})$ and a
specific element $e \in E$, the value of a maximum disjoint packing of
cocircuits, each of which contains $e$, equals the value of a minimum circuit
that contains $e$ for all length functions $l : E \to \mathbb{N}_{\ge 0}$ if
and only if $M$ has no $F_7$-minor that contains $e$.

Now we consider the binary vector matroid $M = (A, \mathbf{1})$ with
$e = \mathbf{1}$ and fix the length function to $l \equiv 1$ except for
$l(\mathbf{1}) = 0$. Then the length of a shortest circuit that contains
$\mathbf{1}$ corresponds to the minimum number of color changes for any
instance of Problem 2. Note that we are allowed not only to pack disjoint
rows, but also odd sums of symmetric differences of rows.

Theorem 6. If $(A, \mathbf{1})$ has no $F_7$-minor that contains $\mathbf{1}$,
then the maximal value of a disjoint odd row sum packing equals the minimum
number of color changes for any instance $(w; f)$ of Problem 3.

For example, the instance shown in Figure 4(a) has a value of
$\gamma(w; f) = 4$, and $(A, \mathbf{1})$ contains no $F_7$-minor. A maximal
odd row sum packing is given by $\{I(A), I(B), I(C)\}$,
$\{I(C), I(D), I(E)\}$, $\{I(B)\}$, and $\{I(E)\}$. The instance in Figure
4(b) does contain an $F_7$-minor (contract the first column, add $I(E)$ to
$I(C)$, and contract the eighth column within $(A, \mathbf{1})$). It has a
value of $\gamma(w; f) = 3$, whereas a maximal odd row sum packing consists of
only two disjoint odd row sums ($\{I(B)\}$ and $\{I(E)\}$).

(a) ABBCDACEED        (b) ABCDBEDAEC

Fig. 4. Instances of Problem 3 without (a) and with (b) an $F_7$-minor in $(A, 1)$



References 

1. Epping Th., Hochstättler W. (2002) Storage and Retrieval of Car Bodies by the
Use of Line Storage Systems. Technical report btu-lsgdi-001.02, BTU Cottbus,
Germany

2. Epping Th., Hochstättler W., Oertel P. (2002) Complexity Results on a Paint
Shop Problem. Submitted to: Discrete Applied Mathematics

3. Golumbic M. C. (1980) Algorithmic Graph Theory and Perfect Graphs.
Academic Press

4. Seymour P. D. (1977) The Matroids with the Max-Flow Min-Cut Property.
Journal of Combinatorial Theory, Series B 23, pp. 189-222

5. Spieckermann S., Voß S. (1996) Paint Shop Simulation in the Automotive
Industry. ASIM Mitteilungen 54, pp. 367-380

6. Oxley J. G. (1992) Matroid Theory. Oxford University Press





From Edge Decomposition Formulae 
to Composition Algorithms 



André Pönitz
Hochschule Mittweida 



Abstract. For some graph invariants simple decomposition formulae are known.
Unfortunately, a straightforward implementation of such formulae usually leads to
algorithms exponential in the number of edges or nodes of the graph, even if the
graph has a structure that would allow polynomial algorithms. This paper describes
a way to derive polynomial algorithms for graphs of bounded pathwidth from edge
decomposition formulae by using a variant of the composition method originally
proposed by the author for the computation of graph invariants in graphs of small
bandwidth.



1 Introduction 



For several graph invariants and other graph-related values decomposition
formulae are known. For example, the all-terminal reliability $R(G)$ of
a graph $G$ satisfies $R(G) = p_e R(G_e) + (1 - p_e) R(G_{-e})$, with $G_{-e}$ denoting
the minor derived from $G$ by deleting some edge $e$, $G_e$ the minor derived
by contracting $e$, and $p_e$ the working probability of edge $e$. Similarly, we have
$N(G) = x N(G_e) + y N(G_{-e})$ for the Negami polynomial and
$\chi(G) = \chi(G_{-e}) - \chi(G_e)$ for the chromatic polynomial. For other invariants
like the residual connectedness reliability, node decomposition formulae are known.
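
To illustrate why a direct implementation of such a formula is exponential, here is a minimal Python sketch that evaluates the chromatic polynomial at an integer point by plain deletion-contraction. The function name `chromatic_value` and the graph encoding (frozensets of two-element frozensets) are assumptions made for this illustration, not part of the paper; every edge doubles the number of recursive calls.

```python
def chromatic_value(nodes, edges, x):
    """Evaluate the chromatic polynomial of a simple graph at x.

    nodes: frozenset of node labels
    edges: frozenset of two-element frozensets
    Uses chi(G) = chi(G - e) - chi(G / e); the edgeless graph on n nodes gives x**n.
    """
    if not edges:
        return x ** len(nodes)
    e = next(iter(edges))
    u, v = tuple(e)
    deleted = edges - {e}
    # Contract e by renaming v to u; parallel edges collapse in the set.
    contracted = set()
    for f in deleted:
        g = frozenset(u if w == v else w for w in f)
        if len(g) == 2:
            contracted.add(g)
    return (chromatic_value(nodes, deleted, x)
            - chromatic_value(nodes - {v}, frozenset(contracted), x))


triangle_nodes = frozenset("abc")
triangle_edges = frozenset({frozenset("ab"), frozenset("bc"), frozenset("ac")})
print(chromatic_value(triangle_nodes, triangle_edges, 3))
```

For the triangle and three colours the recursion already visits several subgraphs; the composition method described below avoids this blow-up on graphs of bounded pathwidth.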

The existence of polynomial algorithms for the computation of such invariants
in graphs of bounded pathwidth (and even for the larger class of graphs of
bounded treewidth, see e.g. [And95,Bod97]) is well-known; it is rather the
simplicity and performance of algorithms created by the composition method
which makes our approach appealing. Starting from a decomposition formula
and assuming a cost of $O(\omega(n))$ for elementary operations on the values, a
working $O((m + n)\,n\,(\omega(n) + \log n))$ implementation ($n$ being the number of
nodes, $m$ the number of edges in the graph) can be obtained completely
automatically within a few minutes.

After introducing some essential notation in the next section, we will give a 
description of the algorithm, a sketch for proving its correctness, an evaluation 
of its complexity, and two examples. 







2 Notation 

Let $G = (V, E)$ be a finite, undirected graph, $n = |V|$, $m = |E|$ and
$t = 2n + m$. A sequence $S = (s_i)_{i=1 \dots t}$, $s_i = (x_i, c_i)$, with an element
$x_i \in V \cup E$ and a code $c_i \in \{A, K, D\}$ is called a composition sequence
if the following holds:

$$\forall v \in V(G)\ \exists v_A, v_D \in \{1 \dots t\} : v_A < v_D \wedge s_{v_A} = (v, A) \wedge s_{v_D} = (v, D)$$

$$\forall e = (uv) \in E(G)\ \exists u_A, v_A, k, u_D, v_D \in \{1 \dots t\} : u_A, v_A < k < u_D, v_D \wedge s_{u_A} = (u, A) \wedge s_{v_A} = (v, A) \wedge s_k = (e, K) \wedge s_{v_D} = (v, D) \wedge s_{u_D} = (u, D) \qquad (1)$$

A composition sequence induces a sequence $(A_i)$ of active node sets by

$$A_i = \{ v \in V(G) : \exists v_A \le i < v_D : s_{v_A} = (v, A) \wedge s_{v_D} = (v, D) \} .$$

The width $\mathrm{wd}(S)$ of a composition sequence $S$ is defined as

$$\mathrm{wd}(S) = \max_i \operatorname{card} A_i .$$

The set of all composition sequences for a given graph will be denoted by
$\mathcal{S}(G)$. It is easy to see that for a graph of pathwidth $p$ there is a composition
sequence of width $p$, as one can be derived from a path decomposition of the
graph.

Given some graph invariant $R$ which takes values in some ring $\mathcal{W}$, we call
a triplet $T = (a, b, r)$ of functions $a, b : E(G) \to \mathcal{W}$, $r : \mathbb{Z} \to \mathcal{W}$ an edge
decomposition formula for the invariant $R$ if the following holds:

$$R(G) = a(e)\, R(G_e) + b(e)\, R(G_{-e}) \qquad (2)$$

$$R(\bar{K}_n) = r(n), \quad r(0) = 1 . \qquad (3)$$

In this formula, $\bar{K}_n$ denotes an edgeless graph on $n$ nodes.

Let $P(V)$ denote the set of all set partitions of subsets of the node set $V$. A
pair $[\pi, d]$ with $\pi \in P(V)$, $d \in \mathbb{Z}$ will be referred to as an index, an element $w$ of
$\mathcal{W}$ as a value. Moreover, a pair $z = (i, w)$ of an index $i = [\pi, d]$ and a value $w$
will be called a state. A multiset $Z$ of states will be called a stateset, the set
$Z_0 = \{([\emptyset, 0], 1)\}$ the initial stateset. $\mathcal{Z}$ shall denote the set of all statesets.

Furthermore, let $\pi|v$ denote a partition consisting of the blocks of $\pi$ and an
additional block containing only $v$, $\pi_{(uv)}$ a partition which results from $\pi$
by merging the block of $\pi$ containing $u$ with the block of $\pi$ containing $v$,
and $\pi - v$ the partition $\pi$ with node $v$ removed. Finally, for an edge
$e = (uv)$ let $[\pi, d](e)$ denote $[\pi_{(uv)}, d']$ where $d' = d$ if $u$ and $v$ lie in
the same block of $\pi$ and $d' = d - 1$ otherwise.

Given a graph $G$ and an edge decomposition formula $T$, a function
$f : \mathcal{S}(G) \times \mathcal{Z} \to \mathcal{Z}$, $f(s_i, Z_{i-1}) = Z_i$ is called a transfer function if

$$f((v, A), Z) = \bigcup_{([\pi,d],w) \in Z} \{ ([\pi|v, d + 1], w) \}$$

$$f((e, K), Z) = \bigcup_{([\pi,d],w) \in Z} \{ ([\pi, d](e), a(e)\, w),\ ([\pi, d], b(e)\, w) \}$$

$$f((v, D), Z) = \bigcup_{([\pi,d],w) \in Z} \{ ([\pi - v, d], w) \}$$

for all $Z \in \mathcal{Z}$ and

$$f(s_t, \dots f(s_2, f(s_1, Z_0)) \dots) = f^* = \mathrm{const}$$

for all composition sequences $S = (s_i) \in \mathcal{S}(G)$ of the given graph $G$.



3 The Algorithm 

The following algorithm can be used to compute a graph invariant given by 
some decomposition formula T for a graph instance G: 

input G . . . a graph, T . . . a decomposition formula

compute some composition sequence S(G) = (s_i)

set Z_0 = {([∅, 0], 1)}

for i from 1 to t do

    set Z_i = collect(f(s_i, Z_{i-1}))

output the sum of r(d) w over all states ([∅, d], w) in Z_t

In this algorithm, we use a function collect: $\mathcal{Z} \to \mathcal{Z}$, $Z \mapsto Z'$ that replaces
states with the same index in $Z$ with a single state of the same index in
$Z'$. The value of the new state equals the sum of the values of the replaced
states. This collection step does not affect the correctness of the
algorithm, but it is crucial for the performance as it guarantees that statesets
never contain more than one state for a given index at the end of each iteration
of the loop.
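
The algorithm above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the author's implementation: the names `composition_sequence`, `step` and `evaluate` are hypothetical, the graph is assumed to be simple (no loops), partitions are stored as frozensets of frozensets, and the composition sequence is built greedily from the given edge order (a node is activated just before its first edge and deactivated right after its last one). The collection step is realised by accumulating values in a dictionary keyed by index.

```python
from collections import defaultdict


def composition_sequence(nodes, edges):
    """Build one valid composition sequence from the given edge order."""
    last = {v: -1 for v in nodes}
    for k, (u, v) in enumerate(edges):
        last[u] = last[v] = k
    seq, seen = [], set()
    for k, (u, v) in enumerate(edges):
        for w in (u, v):
            if w not in seen:
                seen.add(w)
                seq.append((w, "A"))      # activate node
        seq.append(((u, v), "K"))         # handle edge
        for w in (u, v):
            if last[w] == k:
                seq.append((w, "D"))      # deactivate node
    for v in nodes:                       # isolated nodes
        if v not in seen:
            seq += [(v, "A"), (v, "D")]
    return seq


def step(s, Z, a, b):
    """Apply the transfer function to stateset Z, then collect equal indices."""
    x, code = s
    out = defaultdict(int)
    for (pi, d), w in Z.items():
        if code == "A":
            out[(pi | {frozenset([x])}, d + 1)] += w
        elif code == "K":
            u, v = x
            bu = next(B for B in pi if u in B)
            bv = next(B for B in pi if v in B)
            if bu == bv:
                merged, d2 = pi, d
            else:
                merged, d2 = (pi - {bu, bv}) | {bu | bv}, d - 1
            out[(merged, d2)] += a(x) * w
            out[(pi, d)] += b(x) * w
        else:  # "D": remove x from its block, drop the block if it becomes empty
            out[(frozenset(B - {x} for B in pi if B != {x}), d)] += w
    return dict(out)


def evaluate(nodes, edges, a, b, r):
    Z = {(frozenset(), 0): 1}             # initial stateset
    for s in composition_sequence(nodes, edges):
        Z = step(s, Z, a, b)
    return sum(r(d) * w for (_, d), w in Z.items())


nodes, edges = [0, 1, 2], [(0, 1), (1, 2), (0, 2)]
# All-terminal reliability of a triangle with working probability 0.5 per edge.
print(evaluate(nodes, edges, lambda e: 0.5, lambda e: 0.5,
               lambda d: 1 if d <= 1 else 0))
```

Swapping in a(e) = -1, b(e) = 1 and r(d) = x^d evaluates the chromatic polynomial at x with the same code, which is the point of separating the decomposition formula from the composition machinery.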



3.1 Correctness 

Let $G = (V, E)$ be a graph. If we apply (2) repeatedly, we end up with a sum of
the form

$$R(G) = \sum_{X \subseteq E} r(k(G{:}X))\, p_E(X) \qquad (4)$$

with $k(G{:}X)$ denoting the number of connected components of $G{:}X = (V, X)$
and $p_E(X)$ being an abbreviation for $\prod_{e \in X} a(e) \prod_{e \in E \setminus X} b(e)$.

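Let us fix some subset $A$ of $V$. We call the index $[\pi, d]$ induced by $X \subseteq E$ in $A$
(written as $[\pi, d] = \mathrm{ind}_A X$) if there are $d$ connected components in $G{:}X$ and
the blocks of $\pi$ correspond to the intersection of $A$ with $G{:}X$. By collecting
the terms in (4) into blocks with the same induced index we obtain

$$R(G) = \sum_{[\pi,d]} r(d) \sum_{X \subseteq E :\, \mathrm{ind}_A X = [\pi,d]} p_E(X) . \qquad (5)$$

By introducing the set

$$\mathrm{desc}_A(G) = \Big\{ \Big( [\pi, d],\ \sum_{X \subseteq E :\, \mathrm{ind}_A X = [\pi,d]} p_E(X) \Big) \Big\}$$

(5) can be written as

$$R(G) = \sum_{([\pi,d],w) \in \mathrm{desc}_A(G)} r(d)\, w .$$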

With $G_i = (V_i, E_i)$ being the subgraph of $G$ which consists of all nodes
activated and edges handled in the first $i$ steps of $S$ and $A_i$ the active
node set we are prepared to state the following lemma:

Lemma: The stateset $Z_i$ after step $i$ equals $\mathrm{desc}_{A_i}(G_i)$.

Proof (Sketch): In the beginning we have $A_0 = \emptyset$ and an empty graph $G_0$
with zero connected components. Consequently there is only a single index
$[\emptyset, 0]$, and the products in (5) both range over empty sets, so their value
is 1. This coincides with the initial stateset $Z_0 = \{([\emptyset, 0], 1)\}$ used by
the algorithm. The proof continues by assuming $Z_{i-1} = \mathrm{desc}_{A_{i-1}}(G_{i-1})$ and
distinguishes three cases according to the three possible values for the code
$c_i$ in step $s_i = (x_i, c_i)$. In each case $Z_i$ is built according to the rule given in
the algorithm, and by a sequence of simple operations this can be shown to
be equal to $\mathrm{desc}_{A_i}(G_i)$. These operations are elementary, albeit a bit
lengthy, so the details are omitted here.



3.2 Complexity 

Let us assume we are given a composition sequence $S \in \mathcal{S}(G)$ of some graph
$G$ of width $p$, some decomposition formula $T = (a, b, r)$ and assume further
that an operation like addition or multiplication of two values as well as the
evaluation of $a$, $b$, and $r$ can be done in $O(\omega(n))$ time.








For each node of the graph we have an activation and a deactivation step,
and for each edge an edge step, resulting in a total of $O(n + m)$ iterations in
the outer loop of the algorithm.

In each iteration, some stateset is transformed by looping over all states. The
index of such a state consists of a set partition of the current set of active
nodes (as in the node deactivation steps nodes are removed from the indices)
and an integer between 0 and $n$. The number of set partitions over $p$ nodes is
given by the $p$th Bell number $B(p)$, which is, given bounded $p$, bounded by some
constant. As the collection step replaces multiple states with the same index
by one state, there cannot be more than $B(p)\,(n + 1)$ different indices, and as
a result of the collect function no more states than that, at the beginning
of the transformation of the stateset.
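
As a small illustration of the bound $B(p)\,(n + 1)$, the following Python sketch computes Bell numbers with the Bell triangle. The helper names `bell_numbers` and `max_states` are assumptions introduced for this example only.

```python
def bell_numbers(p_max):
    """Return [B(0), ..., B(p_max)] computed with the Bell triangle."""
    bells, row = [1], [1]
    for _ in range(p_max):
        new_row = [row[-1]]               # next row starts with the last entry
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells


def max_states(p, n):
    """Upper bound B(p) * (n + 1) on the number of states after collection."""
    return bell_numbers(p)[p] * (n + 1)


print(bell_numbers(8))
print(max_states(3, 10))
```

The bound grows quickly in the width p but only linearly in n, which is why the method is practical only for sequences of small width.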

Each individual state transformation consists of a bounded number of simple
operations. The most difficult ones include the merging of two blocks of a
given set partition of a set of maximum cardinality $p$, which can be done
in unit time for bounded $p$, and at most four operations on values taking
time of at most $O(\omega)$, leading to a maximum complexity of $O(n\omega)$ for the
transformation of the whole stateset.

Each state gets transformed to at most two new states, such that the total
number of states produced cannot be larger than $2\,B(p)\,(n + 1)$, i.e. $O(n)$. In
order to perform the collection step after the transformation, we can sort the
stateset in $O(n \log n)$ and collapse sequences of states with identical index in
a total of $O(n\omega)$. So the total amount of work for one iteration is
$O(n\,(\log n + \omega))$, leading to a total for the whole algorithm of
$O(n\,(n + m)\,(\log n + \omega))$.

Depending on the actual nature of value operations or the level of abstraction
one could assume $\omega = O(1)$ and $m = O(n)$, which leads to $O(n^2 \log n)$.



4 Examples 

Negami polynomial: The triplet $T = (a, b, r)$ with $a(e) = x$, $b(e) = y$ and
$r(n) = t^n$ is a description of the Negami polynomial (see e.g. [Neg87]). Some
experimental results for $p \times p$ grids are shown in the following table with
“#total” denoting the total number of states, “#max” the maximum size of a
stateset, and “time” the approximate running time in minutes on an AMD 1700+:


 p      #total     #max    time      #total    #max    time
 6     125.898    4.770    0:05      10.087     264    0:04
 7     790.878   22.760    1:03      44.032     858    0:40
 8   4.078.059   93.252   17:04     179.910   2.860    5:30


The second block of results shows the results of a manually improved
algorithm which uses a slightly different transfer function that leads to fewer
states during the course of the algorithm (but, of course, identical final
results).

As there is a well-known linear transformation between the Negami polyno- 
mial and the Tutte polynomial of a graph, we get composition algorithms for 
the Tutte polynomial as well as for several invariants derived from the Tutte 
polynomial without much additional work. 

All-terminal reliability: The triplet $T = (a, b, r)$ with $a(e) = p_e$,
$b(e) = 1 - p_e$, $r(0) = r(1) = 1$ and $r(n) = 0$ for $n > 1$ constitutes a decomposition
formula for the all-terminal reliability. For $p \times p$ grids we obtain:



 p        #total       #max     time     result
 9       745.202      9.724     0:03    1.58e-5
10     3.092.058     33.592     0:12    2.24e-6
11    12.734.918    117.572     0:52    2.27e-7
12    53.276.048    416.024     4:22    2.85e-8
13   218.954.187  1.485.800    20:36    2.46e-9
14   917.861.182  5.348.880    95:50    1.97e-10



The column “result” shows the actual all-terminal reliability for all edges failing
independently with probability 0.9. It should be noted that the algorithm
works equally well if these probabilities are not equal for all edges.



5 Conclusion 

The proposed algorithm is very easy to implement. If performance is crucial, 
it gives a decent starting point for further manual optimization. Some spe- 
cial graph structures automatically lead to better performance without any 
additional work as e.g. for planar graphs, only non-crossing partitions are 
created. The method seems to be easily generalizable to solve problems with 
more complex decomposition formulae and/or other kinds of descriptions like 
splitting formulae. 



References 

[And95] Artur Andrzejak (1995) A polynomial-time algorithm for computation
of the Tutte polynomials of graphs of bounded treewidth.
http://citeseer.nj.nec.com/andrzejak95polynomialtime.html
[Bod97] Hans Bodlaender (1997) Treewidth: Algorithmic techniques and results.
Proceedings 22nd Intern. Symposium on Math. Foundations of Computer
Science, MFCS'97, Lecture Notes in Computer Science, volume 1295
[Neg87] Seiya Negami (1987) Polynomial invariants of graphs. Transactions of the
American Mathematical Society, 299: 601-622
[Poe99] André Pönitz (1999) Computing invariants in graphs of small bandwidth.
Mathematics and Computers in Simulation, 49: 179-191





The Complexity of Some Problems 
on Maximal Independent Sets in Graphs 



Igor Zverovich¹ and Yury Orlovich²

¹ RUTCOR-Rutgers Center for Operations Research, Rutgers University,
640 Bartholomew Rd, Piscataway, NJ 08854-8003, USA,
e-mail: igor@rutcor.rutgers.edu

² Institute of Mathematics, National Academy of Sciences of Belarus,
11 Surganov str, 220072 Minsk, Belarus, e-mail: orlovich@im.bas-net.by



Abstract. Let mi(G) be the number of maximal independent sets in a graph G.
A graph G is mi-minimal if mi(H) < mi(G) for each proper induced subgraph H of
G. As it is shown in [6], every graph G without duplicated or isolated vertices has
at most $2^{k-1} + k - 2$ vertices, where $k = \mathrm{mi}(G) \ge 2$. Hence the extremal problem of
calculating $m(k) = \max\{|V(G)| : G$ is a mi-minimal graph with $\mathrm{mi}(G) = k\}$ has a
solution for any $k \ge 1$. We show that $2(k - 1) \le m(k) \le k(k - 1)$ for any $k \ge 2$ and
conjecture that $m(k) = 2(k - 1)$. We also prove NP-completeness of some related
problems.



1 Introduction 

We consider finite undirected graphs without loops or multiple edges.
Standard graph-theoretical terminology not presented here can be found in [3]. If
$G$ is a graph, $V(G)$ (respectively, $E(G)$) is the vertex set (respectively, the
edge set) of $G$. We denote by $n = |V(G)|$ the order of $G$. We write $K_n$ for the
complete graph of order $n$. The open neighborhood of a vertex $x \in V(G)$ is
the set $N_G(x) = N(x) = \{y \in V(G) : xy \in E(G)\}$. The closed neighborhood
of $x$, denoted by $N_G[x] = N[x]$, is the set $N_G(x) \cup \{x\}$. In addition, we define
$N_G(X) = N(X) = \bigcup_{x \in X} N_G(x)$ and $N_G[X] = N[X] = N_G(X) \cup X$ for a
subset $X$ of $V(G)$. The subgraph of $G$ induced by $X$ is denoted by $G(X)$.
A clique in a graph $G$ is a maximal (by inclusion) vertex set which induces
a complete subgraph in $G$. A subset $I \subseteq V(G)$ is independent if $G(I)$ is an
edgeless graph. A maximal independent set is an independent set that is not
a proper subset of any other independent set. Let Ind(G) be the set of all
maximal independent sets of a graph $G$ and $\mathrm{mi}(G) = |\mathrm{Ind}(G)|$.

Erdős and Moser raised the problem of calculating the function

$$f(n) = \max\{\mathrm{mi}(G) : |V(G)| = n\}$$

and determining those graphs $G$ on which this maximum value is achieved.
This problem was solved by Erdős, and later by Moon and Moser [8]. It
was then extensively studied for various classes of graphs, including trees,
forests, graphs with at most one cycle, bipartite graphs, connected graphs,
$k$-connected graphs, triangle-free graphs; for a survey see [5].

The reverse problem of calculating

$$g(k) = \max\{|V(G)| : \mathrm{mi}(G) = k\}$$

is trivial because $|V(G)|$ is not bounded above on $\mathrm{IN}(k) = \{G : \mathrm{mi}(G) = k\}$.
However, we can restrict the set $\mathrm{IN}(k)$ to minimal graphs in the following
sense. A graph $G$ is mi-minimal if $\mathrm{mi}(H) < \mathrm{mi}(G)$ for each proper induced
subgraph $H$ of $G$. In the sequel we will use the proposition below. This
proposition is proved in [6].

Proposition 1. If $H$ is an induced subgraph of $G$ then $\mathrm{mi}(H) \le \mathrm{mi}(G)$.

Denote by $\mathrm{MIN}(k)$ the set of all mi-minimal $G$ with $\mathrm{mi}(G) = k$ and put
$m(k) = \max\{|V(G)| : G \in \mathrm{MIN}(k)\}$.

Vertices $u$ and $v$ are called duplicated if $N(u) = N(v)$. If $u$ and $v$ are
duplicated vertices of a graph $G$ then $\mathrm{mi}(G - u) = \mathrm{mi}(G)$. Also, deleting
isolated vertices from a graph $G \ne K_1$ produces an induced subgraph $H$
with $\mathrm{mi}(H) = \mathrm{mi}(G)$. The following theorem was proved in [6].

Theorem 1. Let $G$ be a graph without duplicated or isolated vertices and
$\mathrm{mi}(G) = k \ge 2$. Then $|V(G)| \le 2^{k-1} + k - 2$.

As an easy consequence of Theorem 1 we obtain that the set $\mathrm{MIN}(k)$ is
finite for all $k \ge 1$ and $m(k) \le 2^{k-1} + k - 2$ for all $k \ge 2$.

In Section 2, we establish a better bound on $m(k)$. Namely, we show that
$2(k - 1) \le m(k) \le k(k - 1)$ for any $k \ge 2$ and conjecture that $m(k) = 2k - 2$.

In section 3, we prove NP-completeness of some problems associated with 
the parameter mi(G). Next, we discuss the complexity of the problems arising 
from the theory of general partition graphs [7]. These graphs recently have 
been characterized in terms of covers by universal cliques. We prove that the 
problem NON-UNIVERSAL CLIQUE is NP-complete even when the graphs 
in question are weakly chordal and therefore perfect. 

2 Bounds for m(k)

Our first result is the following. 

Theorem 2. $2(k - 1) \le m(k) \le k(k - 1)$ for every $k \ge 2$.

Proof. Let $K_{k-1}$ be the complete graph with vertex set $V(K_{k-1}) = V =
\{v_1, v_2, \dots, v_{k-1}\}$, and let $W \cap V = \emptyset$, where $W = \{w_1, w_2, \dots, w_{k-1}\}$. We
define a graph $G_k$ in the following way: $V(G_k) = W \cup V$, $E(G_k) = E(K_{k-1}) \cup
\{v_i w_i : i = 1, 2, \dots, k - 1\}$. It is easy to check that $\mathrm{mi}(G_k) = k$ and
$\mathrm{mi}(G_k - u) < k$ for any vertex $u \in V(G_k)$. Proposition 1 implies
$\mathrm{mi}(H) < \mathrm{mi}(G_k)$ for every proper induced subgraph $H$ of $G_k$. Thus,
$G_k \in \mathrm{MIN}(k)$ and $m(k) \ge |V(G_k)| = 2(k - 1)$.

Further, let $G \in \mathrm{MIN}(k)$. Then $G$ contains $k$ pairwise distinct maximal
independent sets $I_1, I_2, \dots, I_k$. For all $1 \le i \ne j \le k$ there are adjacent
vertices $u_{ij} \in I_i$ and $u_{ji} \in I_j$. We denote
$U_i = \{u_{ij} : j = 1, 2, \dots, i - 1, i + 1, i + 2, \dots, k\}$,
$U = \bigcup_{i=1}^{k} U_i$ and $H = G(U)$. The set $U_i$ is independent in $H$
since $U_i \subseteq I_i$. We extend $U_i$ to a maximal independent set $J_i$ of $H$. We
have $J_i \ne J_j$ when $i \ne j$ because there are adjacent vertices $u_{ij} \in J_i$ and
$u_{ji} \in J_j$. Hence $\mathrm{mi}(H) \ge k$. By mi-minimality of $G$, we have $G = H$ and
$m(k) \le |V(H)| \le k(k - 1)$. The proof of the theorem is complete.

We conjecture that $m(k) = 2k - 2$ for any $k \ge 2$.
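
The lower-bound construction in the proof of Theorem 2 can be checked by brute force for small k. The following Python sketch is only an illustration: the helper names `mis`, `G_k` and `delete_vertex` are assumptions, and the enumeration over all vertex subsets is exponential, so it is meant for tiny graphs only.

```python
from itertools import combinations


def mis(vertices, edges):
    """List all maximal independent sets by exhaustive enumeration."""
    vs = list(vertices)
    adj = {v: set() for v in vs}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return [set(c) for r in range(len(vs) + 1) for c in combinations(vs, r)
            if all(adj[v].isdisjoint(c) for v in c)                 # independent
            and all(adj[v].intersection(c) for v in vs if v not in c)]  # maximal


def G_k(k):
    """K_{k-1} on v_1..v_{k-1} with a pendant vertex w_i attached to each v_i."""
    V = [("v", i) for i in range(1, k)]
    W = [("w", i) for i in range(1, k)]
    return V + W, list(combinations(V, 2)) + list(zip(V, W))


def delete_vertex(vertices, edges, u):
    return ([v for v in vertices if v != u],
            [(a, b) for a, b in edges if u not in (a, b)])


for k in range(2, 6):
    vertices, edges = G_k(k)
    counts = [len(mis(*delete_vertex(vertices, edges, u))) for u in vertices]
    print(k, len(vertices), len(mis(vertices, edges)), max(counts))
```

For each k the printed line shows the order 2(k - 1), the value mi(G_k) and the largest value of mi(G_k - u) over all single-vertex deletions.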

3 Complexity results 

We shall investigate the complexity of the following problems. 

Decision Problem 1 (MI-MINIMAL GRAPH). 

Instance: a graph G. 

Question: is G a mi-minimal graph? 

Decision Problem 2 (MI-CRITICAL VERTEX). 

Instance: a graph G and a vertex u of G. 

Question: is it true that mi(G - u) < mi(G)? 

A set $X \subseteq V(G)$ dominates a set $Y \subseteq V(G)$ if $Y \subseteq N[X]$. The following
lemma will be useful for the proofs of Theorem 3 and Theorem 4.

Lemma 1. Let $x$ be a vertex of a graph $G$. Then $\mathrm{mi}(G - x) < \mathrm{mi}(G)$ if and
only if there exists a maximal independent set $J$ in $G - N[x]$ that does not
dominate $N(x)$ in $G$.

Proof. We denote

• $A = \{I \in \mathrm{Ind}(G) : x \notin I\}$,

• $B = \{I \in \mathrm{Ind}(G) : x \in I$ and $I \setminus \{x\}$ dominates $N(x)\}$,

• $C = \{I \in \mathrm{Ind}(G) : x \in I$ and $I \setminus \{x\}$ does not dominate $N(x)\}$,

• $A' = \{I \in \mathrm{Ind}(G - x) : I \cap N(x) \ne \emptyset\}$, and

• $B' = \{I \in \mathrm{Ind}(G - x) : I \cap N(x) = \emptyset\}$.

It is clear that $A = A'$ and $I \mapsto I \setminus \{x\}$ is a bijection between $B$ and $B'$.
We have

$$|A \cup B| = |A| + |B| = |A'| + |B'| = |\mathrm{Ind}(G - x)| .$$

Therefore $\mathrm{mi}(G - x) < \mathrm{mi}(G)$ if and only if $C \ne \emptyset$. Finally, there exists $I$ in
$C$ if and only if the set $J = I \setminus \{x\}$ is a maximal independent set in $G - N[x]$
that does not dominate $N(x)$ in $G$. Lemma 1 is proved.
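
Lemma 1 can be sanity-checked by exhaustive enumeration on small graphs. The Python sketch below evaluates both sides of the equivalence; all function names are assumptions made for this illustration, and the empty vertex set is treated as the unique maximal independent set of the empty graph.

```python
from itertools import combinations


def adjacency(vertices, edges):
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def mis(vertices, edges):
    """All maximal independent sets, by brute force."""
    vs, adj = list(vertices), adjacency(vertices, edges)
    return [set(c) for r in range(len(vs) + 1) for c in combinations(vs, r)
            if all(adj[v].isdisjoint(c) for v in c)
            and all(adj[v].intersection(c) for v in vs if v not in c)]


def induced(vertices, edges, removed):
    """Subgraph obtained by deleting the vertex set `removed`."""
    keep = [v for v in vertices if v not in removed]
    return keep, [(a, b) for a, b in edges
                  if a not in removed and b not in removed]


def lemma1_sides(vertices, edges, x):
    adj = adjacency(vertices, edges)
    # Left side: mi(G - x) < mi(G).
    left = len(mis(*induced(vertices, edges, {x}))) < len(mis(vertices, edges))
    # Right side: some maximal independent set J of G - N[x] leaves a
    # neighbour y of x without any neighbour in J.
    right = any(any(not (adj[y] & J) for y in adj[x])
                for J in mis(*induced(vertices, edges, adj[x] | {x})))
    return left, right


# Path a - b - c - d, tested at every vertex.
P4 = (["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "d")])
print([lemma1_sides(*P4, x) for x in P4[0]])
```

Both entries of every printed pair should agree; the test below repeats the comparison over all graphs on four labelled vertices.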








Theorem 3. MI-MINIMAL GRAPH and MI-CRITICAL VERTEX problems
are polynomially equivalent.

Proof. It follows from Proposition 1 that a graph $G$ is mi-minimal if and only
if $\mathrm{mi}(G - u) < \mathrm{mi}(G)$ for each vertex $u$ of $G$. Hence MI-MINIMAL GRAPH
is polynomially reducible to MI-CRITICAL VERTEX.

Conversely, let $G$ and $u$ be an instance of MI-CRITICAL VERTEX. We
construct a new graph $F$ by

• taking two disjoint copies $G$ and $G'$ of $G$,

• fixing an isomorphism $V(G) \to V(G')$, $v \mapsto v'$, and

• adding new edges $v'w$ for all $v' \in V(G')$ and $w \in$

Now we delete the vertex $u'$ from $F$, and we denote the resulting graph by
$H$. By the construction, $N_H(v) = N_H(v')$ for every vertex $v \ne u$. This fact
and Lemma 1 imply that $\mathrm{mi}(H - x) < \mathrm{mi}(H)$ for every vertex $x \in V(H) \setminus \{u\}$.

Thus, $H$ is a mi-minimal graph if and only if $\mathrm{mi}(H - u) < \mathrm{mi}(H)$. It is
clear from the definition of $H$ and Lemma 1 that the last inequality holds
if and only if $\mathrm{mi}(G - u) < \mathrm{mi}(G)$. In other words, the MI-CRITICAL VERTEX
problem for $G$ and $u$ is polynomially reducible to the MI-MINIMAL GRAPH
problem for $H$. This completes the proof of Theorem 3.

Theorem 4. Both MI-MINIMAL GRAPH and MI-CRITICAL VERTEX 
are NP-hard problems. 

Proof. According to Theorem 3, it is sufficient to consider the MI-CRITICAL
VERTEX problem only.

We shall construct a polynomial-time reduction from SATISFIABILITY,
which is an NP-complete problem (see [2]). An instance of SATISFIABILITY
is a set $C = \{c_1, c_2, \dots, c_m\}$ of clauses over the set $X = \{x_1, x_2, \dots, x_n\}$ of
Boolean variables (each clause being an elementary disjunction of some
literals over $X$). The set of literals over $X$ is

$$L_X = \{x_1, x_2, \dots, x_n\} \cup \{\bar{x}_1, \bar{x}_2, \dots, \bar{x}_n\} .$$

Question: is there a truth assignment to $X$ that satisfies all the clauses in $C$?

Given an instance $X$, $C$ of SATISFIABILITY, we construct a graph $G$
with $V(G) = \{u, v\} \cup L_X \cup C$ as follows:

• $N_G(u) = \{v\}$ and $N_G(v) = \{u\} \cup C$,

• the set $L_X$ induces a matching $x_1\bar{x}_1, x_2\bar{x}_2, \dots, x_n\bar{x}_n$,

• the set $C$ is independent, and

• a vertex $y \in L_X$ is adjacent to a vertex $c_i \in C$ if and only if the clause $c_i$
includes the literal $y$.

We consider $G$ and $u$ as an instance of MI-CRITICAL VERTEX. By
Lemma 1, $\mathrm{mi}(G - u) < \mathrm{mi}(G)$ if and only if there exists a maximal independent
set $J$ in $G - N[u] = G(C \cup L_X)$ that does not dominate $N(u) = \{v\}$ in
$G$. Since each vertex of $C$ is adjacent to $v$ and $J$ does not dominate $v$, $J$ must
be a maximal stable set in $G(L_X)$. Moreover, $J$ dominates $C$ (otherwise $J$ is
not a maximal independent set in $G - N[u]$).

An independent subset $J$ of $L_X$ that dominates $C$ defines the following truth
assignment to $X$ that satisfies all the clauses in $C$: a literal $y$ is true if and
only if $y \in J$. Conversely, given a truth assignment to $X$ that satisfies all the
clauses in $C$, we can define a required maximal independent set $J$: a vertex
$y$ of $L_X$ is in $J$ if and only if the literal $y$ is true.

Thus, SATISFIABILITY is polynomially reducible to MI-CRITICAL
VERTEX. The proof of the theorem is complete.
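
The reduction in the proof of Theorem 4 can be tried out on tiny formulas. The Python sketch below builds the graph from a clause list and compares the brute-force answer to MI-CRITICAL VERTEX with brute-force satisfiability. The encoding (clauses as lists of signed integers, +i for $x_i$ and -i for its negation) and all function names are assumptions chosen for this illustration.

```python
from itertools import combinations, product


def count_mis(vertices, edges):
    """Number of maximal independent sets, by exhaustive enumeration."""
    vs = list(vertices)
    adj = {v: set() for v in vs}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return sum(1 for r in range(len(vs) + 1) for c in combinations(vs, r)
               if all(adj[v].isdisjoint(c) for v in c)
               and all(adj[v].intersection(c) for v in vs if v not in c))


def reduction_graph(n, clauses):
    """Graph on {u, v}, the 2n literals and one vertex per clause."""
    literals = list(range(1, n + 1)) + [-i for i in range(1, n + 1)]
    clause_vertices = [("c", j) for j in range(len(clauses))]
    vertices = ["u", "v"] + literals + clause_vertices
    edges = [("u", "v")] + [("v", c) for c in clause_vertices]
    edges += [(i, -i) for i in range(1, n + 1)]          # matching on literals
    edges += [(y, ("c", j)) for j, c in enumerate(clauses) for y in c]
    return vertices, edges


def u_is_critical(n, clauses):
    """Decide mi(G - u) < mi(G) for the constructed graph."""
    vertices, edges = reduction_graph(n, clauses)
    rest = [x for x in vertices if x != "u"]
    rest_edges = [(a, b) for a, b in edges if "u" not in (a, b)]
    return count_mis(rest, rest_edges) < count_mis(vertices, edges)


def satisfiable(n, clauses):
    return any(all(any((y > 0) == t[abs(y) - 1] for y in c) for c in clauses)
               for t in product([False, True], repeat=n))


for n, clauses in [(2, [[1, 2], [-1, 2]]), (1, [[1], [-1]])]:
    print(clauses, satisfiable(n, clauses), u_is_critical(n, clauses))
```

On each formula the two Boolean answers should coincide, as the theorem predicts; the enumeration is exponential and serves only as a check of the construction.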

A clique in a graph $G$ is called universal if it intersects all maximal
independent sets of $G$. We consider the complexity of the following problem.

Decision Problem 3 (NON-UNIVERSAL CLIQUE).

Instance: A graph $G$ and a clique $C$ of $G$.

Question: Is $C$ not a universal clique?

We use a construction as in [1] for the problem NOT WELL-COVERED
GRAPH. To show that determining if $C$ is not universal is NP-complete, we
will show that there is a polynomial time reduction from 3-SATISFIABILITY.

Decision Problem 4 (3-SATISFIABILITY).

Instance: Collection $C = \{c_1, c_2, \dots, c_m\}$ of clauses on a finite set
$U = \{u_1, u_2, \dots, u_n\}$ of variables such that $|c_i| = 3$ for $i = 1, 2, \dots, m$.

Question: Is there a truth assignment for $U$ that satisfies all the clauses in
$C$?

Theorem 5. NON-UNIVERSAL CLIQUE is an NP-complete problem. 

Proof. One can easily verify the independence and maximality of a proposed
vertex set $I$ with $I \cap C = \emptyset$, showing that the problem is in NP.

Let $C$ and $U$ be an instance of 3-SATISFIABILITY. We construct a graph
$G$ on a vertex set

$$\{u_1, \bar{u}_1, u_2, \bar{u}_2, \dots, u_n, \bar{u}_n\} \cup V,$$

where $V = \{v_1, v_2, \dots, v_m\}$, as follows. For each $i = 1, 2, \dots, n$ form a $K_2$
on vertices $u_i$ and $\bar{u}_i$; form $K_m$ on the vertex set $V$; if
$c_j = (x_{j,1}, x_{j,2}, x_{j,3})$ (where each $x_{j,k}$ is some $u_i$ or $\bar{u}_i$) make $v_j$
adjacent to $x_{j,1}$, $x_{j,2}$ and $x_{j,3}$, $j = 1, 2, \dots, m$. Clearly, $G$ can be
constructed in time polynomial in the length of $C$.

If there exists a vertex $u$ ($u = u_i$ or $u = \bar{u}_i$) which is adjacent to all vertices
in $V$, then 3-SATISFIABILITY has a trivial solution (all clauses contain a
common literal). Suppose it is not the case. Hence $V$ is a clique in $G$. We
consider $G$ and $V$ as an instance of NON-UNIVERSAL CLIQUE.

Assume that there is a satisfying truth assignment

$$t : \{u_1, u_2, \dots, u_n\} \to \{T, F\} .$$

Put $I = \{u_i : t(u_i) = T\} \cup \{\bar{u}_i : t(u_i) = F\}$. Because every $c_i$ contains a
true literal, each $v_i$ is adjacent to a vertex in $I$. So $I$ is a maximal independent
set and $I \cap V = \emptyset$. We obtain that $V$ is not a universal clique.

Conversely, suppose that there is no satisfying truth assignment. Consider
a maximal independent set $I$ in $G$ which does not intersect $V$. Clearly, $I$
contains exactly one of $u_i$ and $\bar{u}_i$. Let us define the truth assignment on

$$\{u_1, \bar{u}_1, u_2, \bar{u}_2, \dots, u_n, \bar{u}_n\}$$

by letting a literal be true if and only if it is in $I$. Maximality of $I$ implies
that each $v_j$ is adjacent to a vertex in $I$. So we construct a satisfying truth
assignment, a contradiction. Hence $V$ intersects all maximal independent sets
in $G$ and $V$ is a universal clique. The proof of the theorem is complete.

A graph is called weakly chordal if neither it nor its complement contains
a chordless cycle with more than four vertices. It is possible to show that the
graph $G$ obtained in the proof of Theorem 5 is a weakly chordal graph. Hence,
NON-UNIVERSAL CLIQUE is NP-complete for weakly chordal graphs. Since
Hayward [4] proved that weakly chordal graphs are perfect, we conclude
that the problem NON-UNIVERSAL CLIQUE is also NP-complete for
perfect graphs.

4 Acknowledgements 

The research of the second author was financed by the Institute of Mathematics
of the NASB within the framework of the State program “Mathematical
Structures”, and partly supported by INTAS (Project INTAS 00-217). Also,
the authors would like to thank the referees for their useful comments and
suggestions.



References 

1. Chvátal, V., Slater, P. J. (1993) A note on well-covered graphs. Ann. Discrete
Math. 55, 179-182

2. Garey, M. R., Johnson, D. S. (1979) Computers and Intractability. W. H. Freeman
and Company, San Francisco

3. Harary, F. (1969) Graph Theory. Addison-Wesley

4. Hayward, R. B. (1985) Weakly triangulated graphs. J. Comb. Theory Ser. B
39, 200-208

5. Jou, M.-J., Chang, G. J. (1995) Survey on counting maximal independent sets.
Proc. Second Asian Math. Conf. World Scientific, Singapore, 265-275

6. Jou, M.-J., Chang, G. J., Lin, C., Ma, T.-H. (1996) A finiteness theorem for
maximal independent sets. Graphs and Combin. 12, 321-326

7. McAvaney, K., Robertson, J., DeTemple, D. (1993) A characterization and
hereditary properties for partition graphs. Discrete Math. 113, 131-142

8. Moon, J. W., Moser, L. (1965) On cliques in graphs. Israel J. Math. 3, 23-28





Testing Solution Quality in Stochastic 
Programs 



David P. Morton 

Graduate Program in Operations Research, The University of Texas at Austin, 
Austin, TX 78712, USA, morton@mail.utexas.edu 



Abstract. We describe a statistical procedure for testing the quality of a feasible 
candidate solution for an important class of stochastic programs. Quality is de- 
fined via the so-called optimality gap and the procedure’s output is a confidence 
interval on this gap. We review a multiple-replications procedure for constructing 
the confidence interval. Then, we present a result that allows the procedure to be 
computationally simplified to a single-replication procedure. 



1 Introduction 

We consider a stochastic optimization problem of the form

$$z^* = \min_{x \in X} E f(x, \xi), \qquad (SP)$$

where $f$ is a real-valued function measuring the performance of a system of
interest, $x$ is a decision vector constrained to obey physical and policy rules
represented by the set $X$, $\xi$ is a random vector, and $E$ is the expectation
operator. We denote a solution to (SP) as $(x^*, z^*)$. Throughout we assume
$f$ can be evaluated exactly, given specific values for its arguments.

Unless the random vector $\xi$ has a small number of realizations (also called
scenarios) or $f$ has a particularly simple structure, it is usually impossible to
solve (SP) exactly. For problems in which $\xi$ is of moderate-to-high dimension
and is continuous or has a large number of realizations, Monte Carlo simulation
is widely regarded as the method of choice for estimating $E f(x, \xi)$, when
$x$ is fixed. So, one approach for approximately optimizing (SP) is to sample $n$
independent and identically distributed (i.i.d.) observations $\xi^i$, $i = 1, \dots, n$,
from the distribution of $\xi$ (of course, other sampling schemes are also possible)
and then solve the approximating problem

$$z_n^* = \min_{x \in X} \frac{1}{n} \sum_{i=1}^{n} f(x, \xi^i). \qquad (SP_n)$$

Solving $(SP_n)$ in place of (SP) is justified by consistency results that
establish conditions on $f$, $X$, $\xi$ and the sampling procedure under which
solutions $(x_n^*, z_n^*)$ to $(SP_n)$ are optimal to (SP) as the sample size $n$ grows to
infinity [3,10,15,17]. These consistency results are clearly needed, but they
are not enough because they tell us nothing about the quality of a solution,
$x_n^*$, obtained by solving $(SP_n)$ for finite $n$. In this paper we first review a
multiple-replications methodology for establishing the quality of the solution
$x_n^*$ obtained by solving $(SP_n)$ and then prove a result that allows the
procedure to be computationally simplified to a single-replication procedure.

The optimal solution value, $z_n^*$, to $(SP_n)$ plays an important role in
establishing the quality of a candidate solution. So, before proceeding, we give a
simple example that illustrates a number of properties of $z_n^*$.

Example 1. Define (SP) through $X = [-1, 1]$, $\xi \sim N(0, 1)$, and $f(x, \xi) = \xi x$.
Clearly, $z^* = 0$ and every feasible solution is an optimal solution to (SP).
Even though this is a trivial problem we can form the approximating problem

$$z_n^* = \min_{-1 \le x \le 1} \Big( \frac{1}{n} \sum_{i=1}^{n} \xi^i \Big) x$$

and we find $x_n^* = \pm 1$, with the sign opposite to that of the sample mean. The
objective coefficient is an average of $n$ i.i.d. standard normals and so
$z_n^* = -|N(0, 1/n)|$. In this example, $z_n^*$ has the following properties:

1. $E z_n^* \le z^*\ \forall n$        negative bias

2. $E z_n^* \le E z_{n+1}^*$        monotonically shrinking bias

3. $z_n^* \to z^*$, wp1        strong consistency

4. $\sqrt{n}\,(z_n^* - z^*) = -|N(0, 1)|$        non-normal errors

5. $B_z(n) \equiv E z_n^* - z^* = a_1/\sqrt{n}$        $O(n^{-1/2})$ bias. □
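
The properties listed in Example 1 are easy to observe numerically. The following Python sketch, using only the standard library, solves the sampled problem in closed form and estimates $E z_n^*$ by repetition. The function names and the comparison value $-\sqrt{2/(\pi n)}$ (the mean of $-|N(0, 1/n)|$) are supplied here for illustration and are not taken from the paper.

```python
import math
import random


def solve_spn(xis):
    """Solve (SP_n) for f(x, xi) = xi * x on X = [-1, 1]."""
    c = sum(xis) / len(xis)          # sampled objective coefficient
    x = -1.0 if c > 0 else 1.0       # a linear objective is minimised at an endpoint
    return x, c * x                  # (x_n^*, z_n^*), where z_n^* = -|c|


def mean_z(n, reps, rng):
    """Monte Carlo estimate of E z_n^* from `reps` independent replications."""
    total = 0.0
    for _ in range(reps):
        total += solve_spn([rng.gauss(0, 1) for _ in range(n)])[1]
    return total / reps


rng = random.Random(0)
for n in (10, 40, 160):
    estimate = mean_z(n, 5000, rng)
    exact = -math.sqrt(2 / (math.pi * n))
    print(n, round(estimate, 4), round(exact, 4))
```

Every estimate is negative although $z^* = 0$, and quadrupling $n$ roughly halves the bias, in line with the $O(n^{-1/2})$ rate.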

These first three properties hold much more generally, although some mild
conditions are needed (as can be seen by replacing $X = [-1,1]$ with
$X = \mathbb{R}$). The fourth condition is not in a form to hold for more
general instances of $(SP)$. Instead of "$=$" we usually have "$\Rightarrow$",
i.e., convergence in distribution, and the limiting distribution will differ.
That said, the result is representative of the more general case both with
respect to the rate of convergence, $n^{-1/2}$, and the "folded" or
"projected" normal random variables that arise [2,8,17]. The fifth property
concerns the rate at which the bias of $z_n^*$ shrinks to zero. When $(SP)$
has multiple optimal solutions (as in the example), $O(n^{-1/2})$ bias arises
in a very general setting. However, when $(SP)$ has a unique optimal solution,
$Ef(x,\xi)$ is sufficiently smooth, and $X$ satisfies a regularity condition,
the bias is of order $O(n^{-1})$ [18].
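
The negative bias and its $n^{-1/2}$ rate in Example 1 can be checked
numerically. The following is a minimal Python sketch, not part of the
original paper: the function names, the seed, and the replication count are
illustrative assumptions. It relies on the fact that for this example
$Ez_n^* = -E|N(0,1/n)| = -\sqrt{2/(\pi n)}$, so that $a_1 = -\sqrt{2/\pi}$.

```python
import math
import random
import statistics


def sample_z_n(n, rng):
    """Optimal value of (SP_n) in Example 1: min over x in [-1, 1] of xi_bar * x."""
    xi_bar = statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n))
    return -abs(xi_bar)  # attained at x = -1 if xi_bar > 0, else x = 1


def estimate_bias(n, replications, seed=0):
    """Monte Carlo estimate of E[z_n^*] - z^*, where z^* = 0 in Example 1."""
    rng = random.Random(seed)
    return statistics.fmean(sample_z_n(n, rng) for _ in range(replications))


for n in (10, 40, 160):
    estimate = estimate_bias(n, replications=5000)
    exact = -math.sqrt(2.0 / (math.pi * n))  # closed form for this example only
    print(n, round(estimate, 4), round(exact, 4))
```

Quadrupling $n$ should roughly halve the estimated bias, which is the
$O(n^{-1/2})$ behavior stated in property 5.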



2 A Multiple Replications Procedure 

A key observation from the discussion in Section 1 is that $z_n^*$ gives a
lower bound, in expectation, on the optimal solution value $z^*$. In integer
programming, and other areas of optimization, lower bounds (for minimization
problems) arise via relaxations of integer constraints and relaxations of
other complicating constraints. Such bounds play fundamental roles in proving
optimality, or near-optimality, of candidate solutions. Our procedure for
stochastic programs is analogous except that our lower bounds are statistical
in nature and therefore yield different types of optimality statements
compared to deterministic problems.

We will measure the quality of a candidate solution $\hat{x}$, e.g.,
$\hat{x} = x_n^*$, by the optimality gap, $Ef(\hat{x},\xi) - z^*$. If the gap
is sufficiently small then $\hat{x}$ is of high quality. Unfortunately, in our
setting this gap cannot be computed exactly. In earlier work [11], we
establish the bias result, $Ez_n^* \le z^*$, and circumvent the issue of
non-normal errors for $z_n^*$ via a replications procedure to construct a
confidence interval (CI) of the form

$$P\{Ef(\hat{x},\xi) \le z^* + \epsilon\} \gtrsim 1 - \alpha. \tag{1}$$

Here, $\hat{x} \in X$ is a candidate solution, $Ef(\hat{x},\xi)$ is its "true"
and unknown expected performance measure, $\epsilon$ is the (random) CI width,
and $1 - \alpha$ is the confidence level, e.g., 0.95. We summarize below our
procedure for constructing (1) from [11].

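Procedure MCSP

Input: Data for $(SP)$. Batch size $n$, number of batches $n_g$, and $n_x$,
which is the size of the approximating problem used to obtain the candidate
solution. Confidence level $1 - \alpha$ and $t$ distribution quantile
$t_{n_g-1,\alpha}$.

Output: Candidate solution $\hat{x}$ and approximate $(1 - \alpha)$-level CI
$[0, \bar{G}_{n_g} + \epsilon_g]$ on $Ef(\hat{x},\xi) - z^*$.

1. Sample observations $\xi^1, \ldots, \xi^{n_x}$ and solve $(SP_{n_x})$ to
   obtain $\hat{x}$.

2. Sample i.i.d. batches $\xi^{i1}, \ldots, \xi^{in}$ for $i = 1, \ldots, n_g$.

3. For each $i = 1, \ldots, n_g$ calculate

   $$G_n^i = \frac{1}{n} \sum_{j=1}^{n} f(\hat{x}, \xi^{ij}) - \min_{x \in X} \frac{1}{n} \sum_{j=1}^{n} f(x, \xi^{ij}). \tag{2}$$

4. Let $\bar{G}_{n_g} = \frac{1}{n_g} \sum_{i=1}^{n_g} G_n^i$,
   $s_g^2 = \frac{1}{n_g - 1} \sum_{i=1}^{n_g} (G_n^i - \bar{G}_{n_g})^2$, and
   $\epsilon_g = t_{n_g-1,\alpha}\, s_g / \sqrt{n_g}$.

5. Print ("Candidate solution:", $\hat{x}$, "Confidence interval on optimality
   gap:", $[0, \bar{G}_{n_g} + \epsilon_g]$).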



In Step 1, $(SP_{n_x})$ is solved to obtain the candidate solution. However,
any technique that generates a feasible solution, $\hat{x} \in X$, can be used
instead. The CI $[0, \bar{G}_{n_g} + \epsilon_g]$ on $Ef(\hat{x},\xi) - z^*$ is
inferred from the central limit theorem (CLT)

$$\sqrt{n_g}\,[\bar{G}_{n_g} - EG_n^i] \Rightarrow N(0, \sigma_g^2) \text{ as } n_g \to \infty, \text{ where } \sigma_g^2 = \mathrm{var}\, G_n^i,$$

and the fact that $EG_n^i \ge Ef(\hat{x},\xi) - z^*$.
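
As an illustration, the steps of Procedure MCSP can be sketched in Python on
the toy problem of Example 1 ($f(x,\xi) = \xi x$, $X = [-1,1]$,
$\xi \sim N(0,1)$). This is a minimal sketch under stated assumptions, not the
implementation used in [11]: the helper names are hypothetical, the
approximating problem is solved in closed form, and the $t$ quantile for 19
degrees of freedom is hard-coded because the Python standard library offers no
$t$ distribution.

```python
import math
import random
import statistics

# Assumed constant: one-sided 95% quantile of the t distribution, 19 d.o.f.
T_QUANTILE_19_DF_95 = 1.729


def solve_sp_n(xis):
    """Solve min over x in [-1, 1] of mean(xis) * x; return (x, optimal value)."""
    xi_bar = statistics.fmean(xis)
    x = -1.0 if xi_bar > 0 else 1.0
    return x, xi_bar * x


def mcsp(n, n_g, n_x, t_quantile, seed=0):
    rng = random.Random(seed)

    def draw(m):
        return [rng.gauss(0.0, 1.0) for _ in range(m)]

    x_hat, _ = solve_sp_n(draw(n_x))             # Step 1: candidate solution
    gaps = []
    for _ in range(n_g):                          # Steps 2 and 3
        batch = draw(n)
        upper = statistics.fmean(batch) * x_hat   # sample mean of f(x_hat, xi)
        _, lower = solve_sp_n(batch)              # optimal value on the same batch
        gaps.append(upper - lower)
    g_bar = statistics.fmean(gaps)                # Step 4
    eps = t_quantile * statistics.stdev(gaps) / math.sqrt(n_g)
    return x_hat, g_bar, eps, gaps


x_hat, g_bar, eps, gaps = mcsp(n=100, n_g=20, n_x=100,
                               t_quantile=T_QUANTILE_19_DF_95)
print("Candidate solution:", x_hat)               # Step 5
print("Confidence interval on optimality gap:", (0.0, g_bar + eps))
```

Because the upper and lower estimates in each batch use the same
observations, every gap estimate is nonnegative by construction.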

MCSP has been applied in several settings [1,7,12,14,19-21]. Other related 
work on establishing solution quality for stochastic programs includes [4,16] 
and algorithm-specific work [5,6,13]. 








One of the shortcomings of the MCSP approach is the cost of performing
multiple replications, i.e., having to solve (say) $n_g = 20$ instances of
$(SP_n)$ can be prohibitive. The technical reason for performing $n_g$
replications is that $z_n^*$ can have non-normal errors. Despite this, we
develop an approach in the next section that enables us to assess solution
quality of $\hat{x}$ by solving a single instance of $(SP_n)$.



3 A Single Replication Procedure 



In this section we show how solving a single instance of $(SP_n)$ yields
sufficient information regarding a lower bound on $z^*$ so that we can make a
valid statistical inference concerning the quality of a candidate solution
$\hat{x}$. We will assume: $\xi^1, \ldots, \xi^n$ are i.i.d. from the
distribution of $\xi$, $Ef^2(x,\xi) < \infty$ $\forall x \in X$, $X$ is
compact, and $Ef(x,\xi)$ is continuous on $X$. We will also assume that all
limit points of $\{x_n^*\}_{n=1}^{\infty}$ are optimal to $(SP)$. This
consistency hypothesis holds provided $f(\cdot,\xi)$ is convex, wp1, and has
integrable subgradients [10]. An alternative sufficient condition is that $X$
is finite [9].

Let $X^*$ denote the set of optimal solutions to $(SP)$, $x^* \in X^*$,
$\bar{f}_n(x) = \frac{1}{n} \sum_{i=1}^{n} f(x,\xi^i)$,
$\sigma^2(x) = \mathrm{var} f(x,\xi)$, and
$s_n^2(x) = \frac{1}{n-1} \sum_{i=1}^{n} [f(x,\xi^i) - \bar{f}_n(x)]^2$.
Let $z_\alpha$ denote the normal quantile satisfying
$P(N(0,1) \le z_\alpha) = 1 - \alpha$. We assume $\alpha < 1/2$ and typically
take $\alpha = 0.05$. Finally, for technical reasons that will become apparent
we let $x^*$ solve $\min_{x \in X^*} \mathrm{var} f(x,\xi)$, i.e., $x^*$ is a
solution with minimum objective function variance among all optimal solutions.

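Because $z_n^*$ is optimal for $(SP_n)$ we have $z_n^* \le \bar{f}_n(x)$, wp1,
$\forall x \in X$. Hence,

$$P\left( \frac{z_n^* - z^*}{s_n(x^*)/\sqrt{n}} \le z_\alpha \right) \ge P\left( \frac{\bar{f}_n(x^*) - z^*}{s_n(x^*)/\sqrt{n}} \le z_\alpha \right). \tag{3}$$

By the standard CLT for i.i.d. random variables

$$\lim_{n \to \infty} P\left( \frac{\bar{f}_n(x^*) - z^*}{s_n(x^*)/\sqrt{n}} \le z_\alpha \right) = 1 - \alpha. \tag{4}$$

Now, $s_n^2(x^*)$ is a strongly consistent estimator of
$\mathrm{var} f(x^*,\xi)$, $\mathrm{var} f(x^*,\xi) \le \mathrm{var} f(x,\xi)$
$\forall x \in X^*$, and $\{x_n^*\}_{n=1}^{\infty}$ has limit points in $X^*$.
Thus,

$$\liminf_{n \to \infty} \frac{s_n(x_n^*)}{s_n(x^*)} \ge 1, \text{ wp1}. \tag{5}$$

Combining (3), (4), (5) and a converging-together result yields:

Theorem 1. Under the hypotheses stated above

$$\liminf_{n \to \infty} P\left( \frac{z_n^* - z^*}{s_n(x_n^*)/\sqrt{n}} \le z_\alpha \right) \ge \lim_{n \to \infty} P\left( \frac{\bar{f}_n(x^*) - z^*}{s_n(x^*)/\sqrt{n}} \le z_\alpha \right) = 1 - \alpha. \tag{6}$$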








When $X^*$ is not a singleton we cannot expect $\{x_n^*\}_{n=1}^{\infty}$ to
have a unique limit. Hence, the sequence $\{s_n(x_n^*)\}$ need not, in
general, have a unique limit. This is why the "liminf" appears on the
left-hand side of (6) instead of "lim."

As described above, lower bounds on $z^*$ are key to establishing the quality
of a candidate solution. The importance of (6) is that from it we infer that
for sufficiently large values of $n$

$$P\{z_n^* - z_\alpha s_n(x_n^*)/\sqrt{n} \le z^*\} \gtrsim 1 - \alpha. \tag{7}$$

So, even though $z_n^*$ is not asymptotically normal we can construct a valid
one-sided CI on $z^*$ using normal quantiles. Intuitively, (7) holds because
the lack of normality comes from a projection of normal random variables that
occurs in a way to make the interval more conservative (see Example 1).

For many problems the cost of performing multiple replications is the
primary computational bottleneck in efficiently establishing solution
quality. In such cases, Theorem 1 will yield computational savings in a
revised MCSP procedure. In particular, we can use $n_g = 1$ batch instead of
(say) $n_g = 20$ batches. Result (6) can be revised, in straightforward
fashion, to include the upper bound estimator so that we obtain a CI on the
optimality gap and not just a lower bound on $z^*$. There are two ways to do
this. The MCSP procedure in Section 2 uses common random number streams
(CRNs) in the upper bound estimator and the lower bound estimator in (2).
Alternatively, we can use independent streams for upper and lower bound
estimation. We have previously investigated this issue in [7,11,12]. For
problems in which evaluating $f(x,\xi)$ is computationally expensive, e.g.,
two-stage stochastic linear programs, using CRNs affords significant variance
reduction (ranging from a factor of 17 to 4100 in [11]). On the other hand,
if $f(x,\xi)$ is relatively inexpensive to evaluate and we can use a
dramatically larger sample size for upper bound estimation then using
independent streams can be preferable (e.g., [7,12]).
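
The single-replication lower bound $z_n^* - z_\alpha s_n(x_n^*)/\sqrt{n}$ can
be illustrated on the toy problem of Example 1, where $z^* = 0$. The Python
sketch below is an illustration under stated assumptions, not the authors'
code: function names, sample sizes, and the seed are hypothetical, and the
approximating problem is solved in closed form.

```python
import math
import random
import statistics


def single_replication_lower_bound(n, alpha, rng):
    """One-sided lower confidence bound on z^* from a single instance of (SP_n)."""
    xis = [rng.gauss(0.0, 1.0) for _ in range(n)]
    xi_bar = statistics.fmean(xis)
    x_n = -1.0 if xi_bar > 0 else 1.0                 # optimal solution of (SP_n)
    z_n = xi_bar * x_n                                # optimal value of (SP_n)
    s_n = statistics.stdev([x_n * xi for xi in xis])  # sample std of f(x_n, xi^i)
    z_alpha = statistics.NormalDist().inv_cdf(1.0 - alpha)
    return z_n - z_alpha * s_n / math.sqrt(n)


def coverage(n, alpha, replications, seed=0):
    """Fraction of replications whose lower bound does not exceed z^* = 0."""
    rng = random.Random(seed)
    hits = sum(single_replication_lower_bound(n, alpha, rng) <= 0.0
               for _ in range(replications))
    return hits / replications


print(coverage(n=50, alpha=0.05, replications=2000))
```

In this particular example the bound is never positive, since both $z_n^*$
and the correction term are nonpositive, so it always covers $z^* = 0$. This
is an extreme case of the conservativeness that the "projected" normal errors
introduce.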



References 

1. M. Bertocchi, J. Dupačová, and V. Moriggia. Sensitivity of bond portfolio's
behavior with respect to random movements in yield curve: A simulation study.
Annals of Operations Research, 99:267-286, 2000.

2. J. Dupačová. On non-normal asymptotic behavior of optimal solutions for
stochastic programming problems and on related problems of mathematical
statistics. Kybernetika, 27:38-52, 1991.

3. J. Dupačová and R.J.-B. Wets. Asymptotic behavior of statistical estimators
and of optimal solutions of stochastic optimization problems. The Annals of
Statistics, 16:1517-1549, 1988.

4. J.L. Higle and S. Sen. Statistical verification of optimality conditions for stochas- 
tic programs with recourse. Annals of Operations Research, 30:215-240, 1991. 

5. J.L. Higle and S. Sen. Stochastic decomposition: an algorithm for two-stage 
linear programs with recourse. Mathematics of Operations Research, 16:650- 
669, 1991. 






6. J.L. Higle and S. Sen. Statistical approximations for stochastic linear program- 
ming problems. Annals of Operations Research, 85:173-192, 1999. 

7. A.S. Kenyon and D.P. Morton. Stochastic vehicle routing with random travel 
times. Transportation Science, 2001. To appear. 

8. A.J. King and R.T. Rockafellar. Asymptotic theory for solutions in statistical 
estimation and stochastic programming. Mathematics of Operations Research, 
18:148-162, 1993. 

9. A.J. Kleywegt, A. Shapiro, and T. Homem-de-Mello. The sample average ap- 
proximation method for stochastic discrete optimization. Stochastic Program- 
ming E-Print Series, 1999. http://dochost.rz.hu-berlin.de/speps/. 

10. A.J. King and R.J.-B. Wets. Epi-consistency of convex stochastic programs. 
Stochastics, 34:83-91, 1991. 

11. W.K. Mak, D.P. Morton, and R.K. Wood. Monte Carlo bounding techniques
for determining solution quality in stochastic programs. Operations Research
Letters, 24:47-56, 1999.

12. D.P. Morton and R.K. Wood. On a stochastic knapsack problem and general- 
izations. In D.L. Woodruff, editor. Advances in Computational and Stochastic 
Optimization, Logic Programming, and Heuristic Search: Interfaces in Computer 
Science and Operations Research, pages 149-168. Kluwer Academic Publishers, 
Boston, 1998. 

13. V.I. Norkin, G.Ch. Pflug, and A. Ruszczyński. A branch and bound method for
stochastic global optimization. Mathematical Programming, 83:425-450, 1998.

14. E. Popova and D. Morton. Adaptive stochastic manpower scheduling. In 
Proceedings of the Winter Simulation Conference, pages 661-668, 1998. 

15. S.M. Robinson. Analysis of sample-path optimization. Mathematics of Opera- 
tions Research, 21:513-528, 1996. 

16. A. Shapiro and T. Homem-de-Mello. A simulation-based approach to two-stage 
stochastic programming with recourse. Mathematical Programming, 81:301-325, 
1998. 

17. A. Shapiro. Asymptotic properties of statistical estimators in stochastic pro- 
gramming. The Annals of Statistics, 17:841-858, 1989. 

18. A. Shapiro. Stochastic programming by Monte Carlo simulation methods.
Stochastic Programming E-Print Series, 2001.
http://dochost.rz.hu-berlin.de/speps/.

19. B. Verweij, S. Ahmed, A. Kleywegt, G. Nemhauser, and A. Shapiro. The sample 
average approximation method applied to stochastic vehicle routing problems: 
a computational study. (Working paper), 2001. 

20. D.W. Watkins, Jr., D.P. Morton, and D.C. McKinney. Monte Carlo techniques
for estimating solution quality in stochastic groundwater management models.
In Proceedings of the XII International Conference on Computational Methods
in Water Resources, Crete, pages 67-74, 1998.

21. G. Zakeri. Metaneos project: Verify optimization solver.
http://www-unix.mcs.anl.gov/metaneos/. Argonne National Laboratory.





Scenario Updating Method for Stochastic 
Mixed-integer Programming Problems 



Guglielmo Lulli^1 and Suvrajeet Sen^2

^1 Department of Statistics, Probability and Applied Statistics, University of
Rome "La Sapienza", I-00185 Rome, Italy, e-mail: guglielmo.lulli@uniroma1.it
^2 Department of Systems and Industrial Engineering, University of Arizona,
Tucson, AZ 85721, USA, e-mail: sen@sie.arizona.edu



Abstract. In this paper, we propose an approximation scheme to solve large 
stochastic mixed-integer programming (SMIP) problems with fixed recourse. We 
refer to this as the Scenario Updating Method. The algorithm is based on solving 
instances of the problem, which contain only a subset of the scenarios in the sce- 
nario tree. At each iteration, the subset of scenarios is updated by adding only those 
scenarios which suggest a significant potential for change in the objective function 
value. The algorithm is terminated when the potential for change is insignificant. 
Different selection and updating rules are discussed. 

We test the effectiveness of our method on a multi-stage stochastic
batch-sizing problem, which is solved at each iteration using a
branch-and-price algorithm. The quality of the computational results
demonstrates the viability of the proposed method.



1 Introduction 

Stochastic mixed-integer programming problems are mixed-integer programming
problems in which some problem data are uncertain. More precisely, the
problem data are given by a discrete time stochastic process
$\{\xi_t\}_{t=1}^{|T|}$ defined on some probability space
$(\Xi, \mathcal{F}, P)$. $T = \{1, \ldots, |T|\}$ is the decision horizon.
Decisions are based on the information available at that time, i.e. on the
set of decisions already made and on the outcome of the random variable in
the previous stages. If $x^t = (x_1, \ldots, x_t)$ is the vector of all
decisions made from stage 1 to stage $t$ and $\xi^t = (\xi_1, \ldots, \xi_t)$
is the vector of the random variable outcomes during the same interval, then
a prototypical multi-stage stochastic program is given by the following
problem:

$$\min\{c_1(\xi_1) x_1 + Q_1(x^1) : W_1 x_1 \le h_1(\xi_1),\ x_1 \in X_1\},$$

where

$$Q_t(x^t) = E_{\xi_{t+1} \mid \xi^t}\left[ \min\{c_{t+1}(\xi^{t+1}) x_{t+1} + Q_{t+1}(x^{t+1}) : T_{t+1}(\xi^{t+1}) x^t + W_{t+1} x_{t+1} \le h_{t+1}(\xi^{t+1}),\ x_{t+1} \in X_{t+1}\} \right]$$

for $t = 1, \ldots, |T| - 1$ with $Q_{|T|} = 0$. Here we assume that $\xi_1$
is known at time $t = 1$ and $E_{\xi_{t+1} \mid \xi^t}$ denotes expectation
with respect to the distribution of $\xi_{t+1}$ conditioned on the
observation $\xi^t$. For all realizations of $\xi$ and time stages, we
suppose that $c_t$, $h_t$, $T_t$, $W_t$ are matrices and vectors of
conformable dimensions. The set $X_t$ denotes restrictions that require some
or all the decision variables to be integer.

In keeping with much of the literature, we assume that the random vector
$\xi$ has a finite support; that is, $\Xi = \{\xi^1, \ldots, \xi^S\}$ with
probabilities $p^1, \ldots, p^S$. This hypothesis allows us to represent
uncertainty by means of scenarios. In most real applications, building a
representative scenario tree is a crucial task. It calls for compromises
between a manageable problem size and the desired precision of results. In
this area of research, two main techniques are used according to the data
available. Cluster analysis is used when the main random factors have been
detected and enough data paths can be generated in accordance with a
stochastic model. A sampling procedure is adopted whenever a well-calibrated
stochastic model is available. In Dupačová, Consigli and Wallace [4], several
methods for generating representative scenario trees for multi-stage
stochastic programs of a general structure are given. Høyland and Wallace
[5], Klaassen [6], Kouwenberg [7] and Pflug [10], to mention a few, provide
several insights into the generation of representative scenario trees in the
field of financial applications, one of the most prolific areas of research.

At the same time, one of the challenges in stochastic programming is the
development of efficient algorithms which are able to solve problems with an
ever larger number of scenarios. Løkketangen and Woodruff [8] combine the
progressive hedging algorithm with a tabu search to solve multi-stage SMIP
with binary variables. A Lagrangian relaxation for use within a branch and
bound algorithm for multi-stage SMIP has been proposed by Carøe and Schultz
[2]. In our paper [9], we propose a branch-and-price algorithm to solve
specially structured multi-stage SMIP problems.

In this paper, we propose a heuristic which approximately solves large SMIP
problems with complete fixed recourse. We refer to this method as the
Scenario Updating Method. The algorithm is based on solving instances of the
problem which contain only a subset of the scenarios in the scenario tree.
At each iteration, the subset of scenarios is updated by adding those
scenarios which imply either a certain degradation or an improvement of the
objective function. The algorithm terminates when no more such scenarios are
available to enter the current scenario subtree. Different selection and
updating rules are discussed. We validate our procedure on a multi-stage
stochastic batch-sizing problem.

The paper is organized as follows. In § 2, we describe the algorithm, while 
the computational analysis is given in § 3. Finally, § 4 contains conclusions 
and future research. 








2 A Heuristic Method for Complete Fixed-Recourse 
SMIP Problems 

In this section we propose an approximation method to solve the SMIP problem
with complete fixed recourse. The complete recourse assumption allows us to
compute a feasible solution which accommodates all the scenarios of the
scenario tree once the heuristic solution is computed. Such a solution
provides an upper bound on the optimal objective value. The purpose of this
procedure is to solve models arising in realistic applications.

The procedure we present in this paper is motivated by the contamination
method, first proposed in Dupačová [3]. The idea behind the contamination
method consists of estimating lower and upper bounds for the value function
of a multi-stage stochastic program when a new scenario is added to the
current scenario tree. The term "contamination" refers to the fact that the
probability distribution $P$ is contaminated by the probability distribution
associated with an additional scenario. While the motivation for the method
lies in post-optimality analysis of stochastic programs, the application to a
sequential tree generation process is one of its intended purposes. This
paper extends the ideas of the contamination method in two ways: first, we
deal with SMIP problems, rather than multi-stage stochastic linear
programming (MSLP) problems, and second, we discuss and compare explicit
rules for algorithm development. In contrast, previously reported work (e.g.
Dupačová, Consigli and Wallace [4]) has not discussed issues related to
algorithmic realization of the method in any detail.

We start the procedure with a subset of scenarios $S_0 \subset S$, where $S$
is the set of scenarios. At each iteration $k$, the algorithm solves an
instance of the problem composed of the subset of scenarios $S_k$. The
solution of the instance is referred to as the current solution. Then, the
subset of scenarios is updated by adding those scenarios which imply a
significant change in the objective function. The algorithm terminates when
no more such scenarios are available to enter the current scenario subtree.
To verify whether a scenario can be selected to enter the current scenario
subtree, we compute an upper bound on the current objective function, relying
on the complete recourse hypothesis. Under this hypothesis, any scenario
solution is feasible for all the scenarios, which allows us to compute a
feasible solution for the original scenario tree.

The Scenario Updating procedure is summarized below. 

Initialization: Select a subset of scenarios of the scenario tree.

Step 1: Solve the "reduced" problem with an appropriate algorithm.

Step 2: Compute an upper and a lower bound on the value function induced by
adding a scenario not in the current scenario subtree. Those scenarios which,
if added to the current scenario tree, imply a certain change in the
objective function are candidates to enter the next subtree.

Step 3: If there are no scenario candidates then stop; otherwise add those
scenarios, or some of them, to the current scenario subtree and go to Step 1.
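
The loop formed by Steps 1 to 3 can be sketched in Python as follows. This is
a minimal illustration under strong simplifying assumptions and not the
authors' implementation: the reduced problem is solved by brute force over a
hypothetical finite list of candidate decisions (standing in for a
branch-and-price solver), and the bound computation of Step 2 is replaced by
a simple proxy, namely the change in expected cost when the current solution
is kept and one more scenario is added. The names `scenario_updating`, `tol`
and `max_add` are illustrative.

```python
def scenario_updating(probs, cost, candidates_x, initial, tol=0.5, max_add=1):
    """probs: scenario -> probability; cost(x, s): cost of decision x in scenario s."""

    def expected_cost(x, subset):
        # Probabilities are renormalized over the scenarios in the subtree.
        total = sum(probs[s] for s in subset)
        return sum(probs[s] * cost(x, s) for s in subset) / total

    subset = set(initial)
    iterations = 0
    while True:
        iterations += 1
        # Step 1: solve the reduced problem (brute force, for illustration).
        x = min(candidates_x, key=lambda v: expected_cost(v, subset))
        base = expected_cost(x, subset)
        # Step 2: proxy for the potential change induced by each new scenario.
        scores = {s: abs(expected_cost(x, subset | {s}) - base)
                  for s in probs if s not in subset}
        ranked = sorted((s for s in scores if scores[s] > tol),
                        key=lambda s: -scores[s])
        # Step 3: stop if there are no candidates, otherwise add some of them.
        if not ranked:
            return x, subset, iterations
        subset.update(ranked[:max_add])


# Toy data (hypothetical): one integer decision, cost |x - demand|.
demand = {0: 10, 1: 11, 2: 12, 3: 30, 4: 31}
probs = {0: 0.3, 1: 0.3, 2: 0.2, 3: 0.1, 4: 0.1}
x, subset, iterations = scenario_updating(
    probs, lambda x, s: abs(x - demand[s]), range(0, 41), initial={0})
print(x, sorted(subset), iterations)
```

Setting `max_add=1` mimics a single-scenario updating rule, while a larger
value adds several candidates per iteration.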








When implementing the proposed algorithm, several issues have to be
addressed. Most of these have an impact on the trade-off between solution
time for each iteration and number of iterations. The first issue pertains to
the number of scenarios to use in the initial subtree, and which ones. At the
beginning of the procedure, it is desirable to have a subtree whose scenarios
are most "representative." A "representative scenario" is one whose solution
can be used for other scenarios without incurring very high penalty costs.
Effective scenario selection ensures faster convergence of the procedure.
Several decision rules may be developed for this initial task. However, it is
not possible, a priori, to decide which one is most representative. Analogous
considerations arise when deciding which of the candidate scenarios should be
used to update the scenario tree. In the following section we will discuss
different selection rules.

Finally, for Step 1 we need to specify the algorithm to be adopted for 
the solution of the restricted problem. As in many successive approximation 
methods, there is a fair amount of art in deciding the appropriate quality 
of solution to require in Step 1. Obtaining an optimal solution (for Step 
1) in each iteration may involve greater computations per iteration, than 
using approximate solutions. In this case, we may seek progressively better 
approximations, which ultimately provide an optimal solution to the entire 
(unreduced) problem. 



3 Computational Analysis 

Both a tuning analysis and a performance evaluation of the algorithm have
been carried out on the stochastic batch-sizing problem. The stochastic
batch-sizing problem belongs to the class of economic lot-sizing (ELS)
models, in which the demand, production, inventory and set-up costs are
uncertain problem parameters evolving as discrete random variables.
Furthermore, production takes place in a batch mode, i.e. the production
level is a multiple of the batch size and consequently the corresponding
decision variables are discrete. A detailed description of the problem is
given in [9].

In our computational analysis we use randomly generated instances. We
generate different sets of problems by varying the number of time periods
(stages) of the system. As for scenario trees, we generate binary trees with
conditional probability for any branch being $p_n$, and the other branch
having conditional probability $1 - p_n$, where $n$ denotes a node of the
tree. Here $p_n$ is chosen from the uniform [0,1] distribution. By choosing
alternative values of $p_n$, as well as costs and demands, we can generate
different problem instances corresponding to any given planning horizon.
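
The tree generation just described can be sketched in a few lines of Python.
This is an illustrative sketch only: the function name and the representation
of a scenario as a tuple of branch labels are assumptions, and costs and
demands are omitted.

```python
import random


def binary_scenario_tree(stages, rng):
    """Return a list of (path, probability) pairs, one per scenario (leaf)."""
    paths = [((), 1.0)]  # the root node (stage 1) is known with certainty
    for _ in range(stages - 1):
        next_paths = []
        for branch, prob in paths:
            p_n = rng.random()  # conditional probability drawn from uniform [0, 1)
            next_paths.append((branch + (0,), prob * p_n))
            next_paths.append((branch + (1,), prob * (1.0 - p_n)))
        paths = next_paths
    return paths


tree = binary_scenario_tree(stages=11, rng=random.Random(0))
print(len(tree), sum(prob for _, prob in tree))
```

With this construction a tree with $|T|$ stages has $2^{|T|-1}$ scenarios, so
an eleven-stage instance has 1024 scenarios and a seven-stage instance has 64.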

We first evaluate different rules for selecting the initial subset of
scenarios $S_0$. We analyze the dissimilarity, the highest probability, the
random and the mixed rules. The mixed rule selects scenarios using a
combination of all the others. The computational results, for a seven-stage
class of instances, are given in Table 1. For each scenario selection rule,
the following statistics are reported: the computational time in seconds
(Time), the objective function value (O.F. value), the number of iterations
of the algorithm (It.s) and the number of scenarios accommodated in the
approximated solution (SG). The best, worst and average (avg) instances are
identified with reference to the optimal solutions computed using the
branch-and-price (B&P) algorithm proposed in [9]. The computational time (in
seconds) and the objective function value of these optimal solutions are,
respectively: 3227 and 16505 (avg), 1406 and 96431 (worst), and 3890 and
8221 (best). In Table 1, the first set of three rows refers to the
single-scenario updating rule and the second set of three rows refers to the
multiple-scenario updating rule. Within each set, the three rows refer to the
best, the worst and the average instance, respectively.

Table 1. Scenario selection rules

                   Mixed                    Highest Probability      Dissimilarity
Updating  Instance    Time   O.F. It.s  SG     Time   O.F. It.s  SG     Time   O.F. It.s  SG
rule               (secs.)  value           (secs.)  value           (secs.)  value
Single    best         210   8173    8  30      607   8247    9  33      686   8318    7  30
Single    worst       1461  95097   21  45     1450  94595   19  44     2465  94064   19  42
Single    average      791  16383   11  36      872  16304   10  34     1066  16365    9  31
Multiple  best         124   8151    5  34      230   8255    4  35      544   8305    5  32
Multiple  worst        782  95133    7  48      501  94563    6  44     1369  93926    5  43
Multiple  average      482  16384    5  37      379  16300    4  35      650  16364    4  31

According to these results, the multiple-scenario updating rule combined with
the mixed selection rule provides better results for the problem tested. To
solve instances of larger dimensions, in terms of the number of stages of the
problem, we combine the CPLEX branch-and-bound (B&B) method and the B&P
algorithm. In fact, B&B provides good feasible solutions in a reasonable
amount of time, which are then used to generate the initial columns in the
B&P algorithm.

Table 2 reports the computational results for eleven-stage instances. In
particular, the following statistics are given: solution time in seconds,
objective function value, number of iterations executed (of B&B and of B&P
respectively) and the number of scenarios accommodated in the solutions.
Even though the computational time is high, it is relatively small
considering the dimension of the instances solved. It is important to note
that the number of scenarios accommodated in the solutions is much smaller
than the number of scenarios in the scenario tree.








Table 2. Instances with 1024 scenarios

Instance  Time (secs.)  O.F. value  It.s (B&B/B&P)   SG
I                14659       15341             4/2   68
II                7673       13517             2/1  107
III               7895       14687             2/1  130
IV                8245       13976             2/1  145
V                 8733       14363             4/2  120



4 Conclusions 

In this paper, we propose a heuristic which solves instances of SMIP problems
with complete recourse and a large number of scenarios. The quality of our
results warrants a deeper investigation of the scenario selection rules and
of stopping criteria. On this subject, the computation of stronger bounds
would allow the solution of larger instances.

References 

1. Birge J.R., Louveaux F. (1997) Introduction to Stochastic Programming.
Springer, Berlin Heidelberg.

2. Carøe C.C., Schultz R. (1999) Dual Decomposition in Stochastic Integer
Programming. Operations Research Letters 24, 37-45.

3. Dupačová J. (1995) Postoptimality for Multistage Stochastic Linear
Programs. Annals of Operations Research 56, 65-78.

4. Dupačová J., Consigli G., Wallace S.W. (2000) Scenarios for Multistage
Stochastic Programming. Annals of Operations Research 100, 25-53.

5. Høyland K., Wallace S.W. (2001) Generating scenario trees for multistage
decision problems. Management Science 47, 295-307.

6. Klaassen P. (1998) Financial Asset-Pricing Theory and Stochastic
Programming Models for Asset/Liability Management: A Synthesis. Management
Science 44, 31-48.

7. Kouwenberg R.R.P. (2001) Scenario generation and stochastic programming
models for asset liability management. European Journal of Operational
Research 134, 51-64.

8. Løkketangen A., Woodruff D.L. (1996) Progressive Hedging and Tabu Search
Applied to Mixed Integer (0,1) Multi-stage Stochastic Programming. Journal of
Heuristics 2, 111-128.

9. Lulli G., Sen S. (2002) A Branch-and-Price Algorithm for Multi-stage
Stochastic Integer Programming with Application to Stochastic Batch-Sizing
Problems. Submitted for publication.

10. Pflug G.Ch. (2001) Scenario tree generation for multiperiod financial
optimization by optimal discretization. Mathematical Programming 89, 251-271.





Pricing of Multidimensional Resources 
in Revenue Management 
(Multidimensional Dynamic Stochastic 
Knapsack Problem) 



Jens Feller 

University of Dortmund, Fakultaet WiSo, Fachgebiet Operations Research und
Wirtschaftsinformatik, j.feller@wiso.uni-dortmund.de



Abstract. Revenue management deals with selling a limited amount of a
perishable resource. This resource becomes valueless at a known point of time
(deadline). There are many papers dealing with one-dimensional resources
(e.g. seats or hotel rooms) using different approaches such as linear
programming or Markov decision processes which consider the problem as a
dynamic stochastic knapsack problem (DSKP).

In this article we address the question of choosing an optimal price of these 
resources in consideration of the remaining time, the resources left and the expected 
demands and focus on the characteristics of multidimensional resources which often 
have to be taken into account, e. g. when transporting freight with limitations to 
volume and weight or when pricing package holidays as a simultaneous selling of 
seats and rooms. 

We present a model using the DSKP-approach for maximizing the expected 
revenue with a given deadline and a multi-dimensional resource when the distri- 
bution functions of the demands and of the offered revenue are known and define 
conditions of a threshold policy that are necessary for any optimal policy. 



1 Introduction 

Revenue management was first applied to problems in the airline industry
(e.g. selling of seats). It has become more and more important on account of
further research results, its practical successes and its wide fields of
application.

It addresses problems that occur when selling items that will become
worthless at a known point in time (e.g. seats on aircraft, hotel rooms).
Some of the most important issues are overbooking (selling more items than
available if it is known that some customers will not use their tickets),
fleet assignment, defining the size of ticket or price classes, and pricing
the items that are going to be sold. McGill and van Ryzin give a good
overview of these topics in [3].







In this article we address the question of accepting or denying requests for
a multi-dimensional resource. Important applications for multi-dimensional
resources are, e.g., freight (volume, weight, containers, etc.) or items that
depend on each other (seats on an aircraft and hotel rooms at the vacation
destination, tickets and hotel rooms at special events, and so on).
Obviously, bad choices may result in being out of stock when getting
lucrative requests, or in denying too many requests and keeping unsold
resources at the deadline.

We are going to maximize the revenue when selling a multi-dimensional
perishable resource. There are requests (arrivals), each consisting of an
offered price and a demanded amount of resource, at different points in time.
We have to decide on each request whether we accept it (take the money and
give away the requested resource) or deny it (keep the resource).

If we knew all upcoming requests at the beginning, the problem would be to
pick those arrivals that maximize our revenue without exceeding the available
items.
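
For a handful of requests, this full-information version of the problem can
be solved by enumeration. The Python sketch below is purely illustrative and
is not taken from the article: the function name, the two resource dimensions
and the request data are hypothetical, and brute force is only viable for
very small instances.

```python
from itertools import combinations


def best_offline_selection(capacity, requests):
    """requests: list of (reward, demand_vector). Returns (revenue, chosen indices)."""
    best = (0.0, ())
    indices = range(len(requests))
    for size in range(1, len(requests) + 1):
        for chosen in combinations(indices, size):
            # Total demand of the chosen requests in every resource dimension.
            used = [sum(requests[i][1][d] for i in chosen)
                    for d in range(len(capacity))]
            if all(u <= c for u, c in zip(used, capacity)):
                revenue = sum(requests[i][0] for i in chosen)
                if revenue > best[0]:
                    best = (revenue, chosen)
    return best


# Hypothetical freight example: capacity in (volume, weight).
capacity = (4, 3)
requests = [(5.0, (2, 1)), (4.0, (2, 2)), (3.0, (1, 1)), (6.0, (3, 3))]
print(best_offline_selection(capacity, requests))  # (9.0, (0, 1))
```

Here the single most lucrative request (reward 6.0) is not part of the best
selection, because it would block both resource dimensions at once.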

In our particular problem these arrivals are stochastic and not known in 
advance. At each time our information consists of the amount of available 
items and the remaining time to sell those. Furthermore we assume that we 
know the distributions of the upcoming arrivals of demands. 

For one-dimensional resources (e.g. seats or hotel rooms) this problem is
addressed by the articles of Kleywegt and Papastavrou ([1] and [2]) and Van
Slyke and Young [4]. Both model the problem as a dynamic stochastic knapsack
problem (DSKP). Van Slyke and Young give a hint on how to apply their
approach to a multi-dimensional resource.

In this article we will define a model for a multi-dimensional resource.
First we will define the assumptions and variables of the model. We will name
the variables analogously to [2]. Next we will show some properties of these
variables and give conditions that are necessary for maximizing the expected
revenue. Afterwards we give some results from a similar approach with
discrete periods of time.



2 Model 

We are going to maximize the revenue of selling a resource of fc € N dimen- 
sions and an initial amount of N\ G N§ of this resource. Let T € (0, oo) be 
the deadline for accepting these items. The remaining resources at T cannot 
be sold anymore and will be worthless. 

Let $\{A_i\}_{i \in \mathbb{N}}$ describe the random arrival times and
$\mathcal{N} = \{n_0, n_1, \ldots, n_k\}$, with each $n_j \in \mathbb{N}_0^k$, the set
of possible amounts that may be asked for. Each request consists of such an
amount and a price offered for these items. With the state space $\Omega$, the
random variables $S_i : \Omega \to \mathcal{N}$ and $R_i : \Omega \to \mathbb{R}$ give
the asked amount and the offered price of the $i$th arrival (revenue or reward
of this arrival). We assume that the marginal probability distributions of $S_i$,
$F_{S_i}(x) = P\{S_i \le x\}$, and the conditional distribution of $R_i$ given $S_i$,
$F_{R|S}(x) = P\{R_i \le x \mid S_i = s\}$, are known. In this article we will not use a
discount rate for the revenues.

To maximize the revenue of the accepted arrivals we have to decide for
each arrival whether to accept or deny it. Let the decision variable $D_i$ be 1
if we accept the $i$th arrival and 0 if we deny it.

To ensure that we do not sell more resources than available we define
$\Pi_C(N,t)$ as the class of admissible policies with resource $N$ at time $t$. The
expected reward when using an admissible policy $\pi$ is given by $V_C^\pi(t)$ and
the optimal expected revenue by $V_C^*(N,t)$ when $N$ resources are left. $N_i$ is
the amount of available resources before the decision about arrival $i$, that is
$N_i = N_1 - \sum_{j=1}^{i-1} D_j S_j$.

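By those definitions we get the set of arrival times
$\mathcal{A} = \{\{A_i\}_{i \in \mathbb{N}} : 0 < A_1 < A_2 < \cdots < \infty\}$, the set of rewards
$\mathcal{R} = \{\{R_i\}_{i \in \mathbb{N}} : R_i : \Omega \to \mathbb{R}\}$ and the set of the
demanded resources $\mathcal{S} = \{\{S_i\}_{i \in \mathbb{N}} : S_i : \Omega \to \mathcal{N}\}$. The set of
decisions is given by $\mathcal{D} = \{\{D_i\}_{i \in \mathbb{N}} : D_i \in \{0,1\}\}$. Therefore we can define
the admissible policies if there has not been an arrival in $t$ by

$$\Pi_C(N,t) = \Big\{\pi : \mathcal{A} \times \mathcal{R} \times \mathcal{S} \to \mathcal{D} : \sum_{\{i : t < A_i < T\}} D_i^\pi S_i \le N\Big\}$$

with $\{D_i^\pi\} = \pi(\{A_i\}, \{R_i\}, \{S_i\})$ as the decisions when using policy $\pi$.

The expected reward using policy $\pi$ is then given by
$V_C^\pi(t) = E\big[\sum_{\{i : t < A_i < T\}} R_i D_i^\pi\big]$, while the optimal reward is defined by
$V_C^*(N,t) = \sup_{\pi \in \Pi_C(N,t)} V_C^\pi(t)$.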

To handle arrivals in t we define IIc{N^t) = {tt : A x TZ x S V : 
Z{i:t<A,<T}^Si < N}, V^it) = E[Z{,.t<A^<T}RiDf] and V^(N,t) = 
^^P 7 rG 7 Tc{iv t) (^)- there is an j with Aj = t the decision variable 
must be 1 to accept or 0 to deny this arrival. 

3 Threshold Policy 

The following properties guarantee a 'good' behavior of the threshold we will
define later. We will say $N \le N'$ for some $N, N' \in \mathbb{N}_0^k$ if each component of $N$
is less than or equal to the corresponding component of $N'$, and $N \not\le N'$ if there
is at least one component of $N$ greater than the corresponding component of
$N'$ ($\ge$ and $\not\ge$ analogously).

First we show that there are always admissible policies and that increasing 
our amount of resources can lead to more admissible policies. 

Lemma 1

Let $N \in \mathbb{N}_0^k$ and $t \in [0,T)$.

a) Then $\Pi_C(N,t) \neq \emptyset$.

b) For each $N' \in \mathbb{N}_0^k$ with $N' \ge N$ we have $\Pi_C(N,t) \subseteq \Pi_C(N',t)$.

Proof: a) The policy $\pi_0$ of denying all requests, with $D_i^{\pi_0} = 0\ \forall i$, is admissible
for all resources $N \in \mathbb{N}_0^k$ and $t \in [0,T)$ with $V_C^{\pi_0}(t) = 0$, hence
$\Pi_C(N,t) \neq \emptyset$.








b) Suppose $\pi$ is some policy $\pi \in \Pi_C(N,t)$ for an arbitrary $N \in \mathbb{N}_0^k$ and
$t \in [0,T]$. Then for all $N' \in \mathbb{N}_0^k$ with $N' \ge N$ it follows from the definition of
$\Pi_C$ that $\sum_{\{i : t < A_i < T\}} D_i^\pi S_i \le N \le N'$ and $\pi \in \Pi_C(N',t)$. □

Next we show that the expected revenue $V_C^\pi$ of an arbitrary policy and
the optimal expected revenue $V_C^*$ are always defined.

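Lemma 2

Let $N \in \mathbb{N}_0^k$, $t \in [0,T)$ and $E\big[\sum_{\{i : t < A_i < T\}} |R_i|\big] < \infty$.

a) If $\pi \in \Pi_C(N,t)$, then $V_C^\pi(t) = E\big[\sum_{\{i : t < A_i < T\}} D_i^\pi R_i\big] \in \mathbb{R}$.

b) $V_C^*(N,t) = \sup_{\pi \in \Pi_C(N,t)} E\big[\sum_{\{i : t < A_i < T\}} D_i^\pi R_i\big] \in \mathbb{R}_{\ge 0}$.

Proof: a) Let $\pi$ be some policy $\pi \in \Pi_C(N,t)$ with resource $N \in \mathbb{N}_0^k$ and
$t \in [0,T)$. Then we know with the assumption

$$\infty > E\Big[\sum_{\{i : t < A_i < T\}} |R_i|\Big] \ge E\Big[\sum_{\{i : t < A_i < T\}} D_i^\pi |R_i|\Big] = E\Big[\sum_{\{i : t < A_i < T\}} |D_i^\pi R_i|\Big] \ge E\Big[\Big|\sum_{\{i : t < A_i < T\}} D_i^\pi R_i\Big|\Big] \ge \Big|E\Big[\sum_{\{i : t < A_i < T\}} D_i^\pi R_i\Big]\Big| \ge 0.$$

Thus $V_C^\pi(t) = E\big[\sum_{\{i : t < A_i < T\}} D_i^\pi R_i\big] \in \mathbb{R}$.

b) From Lemma 1 we know that $V_C^*(N,t) \ge V_C^{\pi_0}(t) = 0\ \forall N \ge 0$. Furthermore

$$V_C^*(N,t) = \sup_{\pi \in \Pi_C(N,t)} V_C^\pi(t) = \sup_{\pi \in \Pi_C(N,t)} E\Big[\sum_{\{i : t < A_i < T\}} R_i D_i^\pi\Big] \le E\Big[\sum_{\{i : t < A_i < T \,\wedge\, R_i > 0\}} R_i\Big] \le E\Big[\sum_{\{i : t < A_i < T\}} |R_i|\Big] < \infty. \quad \Box$$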

Furthermore the optimal expected revenue is non-decreasing in the amount 
of available resources. 



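Lemma 3 For each $N, N' \in \mathbb{N}_0^k$ with $N \le N'$ and $t \in [0,T]$ it follows that
$V_C^*(N,t) \le V_C^*(N',t)$, that is $V_C^*(N,t)$ is non-decreasing in $N$.

Proof: Let $N, N' \in \mathbb{N}_0^k$ and $t \in [0,T)$ be arbitrarily chosen with $N' \ge N$.
We know from Lemma 1 that $\Pi_C(N,t) \subseteq \Pi_C(N',t)$.

Let $\Delta = \Pi_C(N',t) \setminus \Pi_C(N,t)$. Then

$$V_C^*(N',t) = \sup_{\pi \in \Pi_C(N',t)} V_C^\pi(t) = \sup_{\pi \in \Pi_C(N,t) \cup \Delta} V_C^\pi(t) = \max\Big\{\sup_{\pi \in \Pi_C(N,t)} V_C^\pi(t),\ \sup_{\pi \in \Delta} V_C^\pi(t)\Big\} \ge \sup_{\pi \in \Pi_C(N,t)} V_C^\pi(t) = V_C^*(N,t). \quad \Box$$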

Now we define a class of admissible policies by a threshold r*. Each arrival 
with a greater revenue than r* will be accepted, each arrival with a lower 
revenue will be denied by this policy. We will show that each optimal policy 
must be such a threshold policy. 



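Definition 1

Let the threshold $r^* : \mathbb{N}_0^k \times \mathcal{N} \times [0,T) \to \mathbb{R}_{\ge 0} \cup \{\infty\}$ be defined by

$$r^*(N,s,t) = \begin{cases} V_C^*(N,t) - V_C^*(N-s,t) & ;\ s \le N \\ \infty & ;\ s \not\le N. \end{cases} \qquad (1)$$

Then we call a policy $\pi'$ an $r^*$-threshold policy with the remaining resources
$N$ at time $A_i$ if for each arrival $\{A_i, R_i, S_i\}$ with $t < A_i < T$ (and a positive
probability of appearance) the implications $R_i > r^*(N,S_i,A_i) \Rightarrow D_i^{\pi'} = 1$
and $R_i < r^*(N,S_i,A_i) \Rightarrow D_i^{\pi'} = 0$ are true.

The class of all these $r^*$-threshold policies is denoted by $\Pi'_C(N,t)$ (with the
analogous class for the case that all $A_i \neq t$).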

Remark a) The values of $r^*$ are nonnegative (this follows from Lemmas 1 to 3).

b) The $r^*$-threshold policies differ only on realizations where there is at
least one $R_i = r^*(N, S_i, A_i)$. Such arrivals can be accepted or denied.

Lemma 4

$\Pi'_C(N,t) \subseteq \Pi_C(N,t)$ with $N \ge 0$ and $t \in [0,T]$, that is all $r^*$-threshold
policies are admissible.

Proof: Let $n \in \mathbb{N}$ be the number of arrivals after $t$ and $\pi' \in \Pi'_C(N,t)$
some $r^*$-threshold policy. We will show by induction that after every decision
$N_n \ge 0$, that is we do not sell more resources than $N$.

i) $n = 1$: Consider arrival $(A_1, S_1, R_1)$ with $N_1 = N$. If $S_1 \not\le N_1$ then
$r^*(N_1, S_1, A_1) = \infty > R_1$ and $D_1^{\pi'} = 0$. Thus $N_2 = N_1 = N \ge 0$. If $S_1 \le N_1$,
both $D_1^{\pi'} = 0$ and $D_1^{\pi'} = 1$ lead in every dimension to a nonnegative stock of
resources as $N_2 \ge N_1 - S_1 \ge 0$.

ii) $n \to n+1$: $\pi'$ leads to admissible decisions for the first $n$ arrivals:
$N_{n+1} \ge 0$. If there is an arrival $(A_{n+1}, S_{n+1}, R_{n+1})$, again there are two
cases. If $S_{n+1} \not\le N_{n+1}$, then $r^*(N_{n+1}, S_{n+1}, A_{n+1}) = \infty > R_{n+1} \Rightarrow D_{n+1}^{\pi'} = 0 \Rightarrow
N_{n+2} = N_{n+1} \ge 0$, and the decision is admissible. If $S_{n+1} \le N_{n+1}$, again both
accepting and denying are admissible as $N_{n+2} \ge N_{n+1} - S_{n+1} \ge 0$.

Hence $\pi'$ never sells more resources than available, so the policy is admissible.

□

Proposition 1

Let $\pi^* \in \Pi_C(N,t)$ for some $N$ and $t$ be an optimal policy, that is $V_C^{\pi^*}(t) =
V_C^*(N,t)$. Then $\pi^* \in \Pi'_C(N,t)$.

Proof: Let $\pi \in \Pi_C(N,t)$ be some optimal policy with $\pi \notin \Pi'_C(N,t)$. Thus
there is at least one realization with positive probability and an $i$ with $t < A_i < T$
such that $D_i^{\pi} \neq D_i^{\pi'}$, $R_i \neq r^*(N_i, S_i, A_i)$ and $S_i \le N_i$ $\forall \pi' \in \Pi'_C(N,t)$ (otherwise
$\pi$ would be an $r^*$-threshold policy). Consider the following two possible cases
for the smallest of these $i$:

i) r* -threshold-policy accepts and tt denies: Df = 1 and = 0 ^ Ri> 
r*{Ni,SuAi) and V^{t) = 0 + ^(t) < VS{Ni,t) = r^{Ni,Si,Ai) + V^{Ni - 
5i,t) < Ri-\- VQ{Ni - Si^t) = VQ{Ni,t) showing that the denial of tt is not 
optimal. 

ii) r* -threshold-policy denies and tt accepts: Df = 0 and Df = 1 => /ii < 
r*{Ni,Si,Ai) andl^(t) = Ri±VS{t) < Ri + V^iNi-Sut) < r*{Ni,Si,Ai) + 
VQ{Ni - Si, t) < VQ{Ni,t) = V^{Ni,t), accepting is not optimal. 

In both cases the decisions that do not follow the r* -threshold-policy lead 
to an expected revenue that is below the optimal expected revenue, hence tt 
cannot be optimal. □ 








Remark Under further assumptions it can be shown that a policy is optimal
if it is an $r^*$-threshold policy (e.g. if there are at most $n \in \mathbb{N}_0$ further arrivals
until $T$, by induction). More general assumptions are under current research
by the author.

4 Extensions and Conclusions 

In the last section we defined a threshold policy and showed that following it
is necessary for a policy to achieve the optimal expected revenue in our model.
Unfortunately we do not have a formula to calculate the optimal revenues $V^*$ that
define the optimal thresholds.

If we use discrete time periods in our model above and choose these periods
small enough to assume that there is at most one arrival within one period,
then we can define an algorithm for calculating the optimal revenues $V^*(N,t)$
and the corresponding thresholds by a dynamic programming approach. This
discrete model and the following observations are under current research by
the author.
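
A minimal sketch of such a dynamic program is given below. It is not the author's algorithm; it assumes (beyond what the text states) that time is split into `horizon` periods, that in each period at most one request arrives, and that a request of a given size and price arrives with a fixed, time-independent probability. The names `solve_discrete_dskp`, `request_dist`, `value` and `threshold` are hypothetical. The threshold is computed as the difference of the continuation values with and without the requested resources, in analogy to the threshold $r^*$ of Section 3.

```python
from functools import lru_cache


def solve_discrete_dskp(horizon, request_dist):
    """Return (value, threshold) for a discrete-time multi-dimensional model.

    horizon      -- number of periods 0, 1, ..., horizon - 1
    request_dist -- list of (probability, size, price); size is a tuple with
                    one entry per resource dimension. The probabilities may
                    sum to less than 1; the remainder means "no arrival".
    """
    p_none = 1.0 - sum(p for p, _, _ in request_dist)

    def fits(size, stock):
        return all(s <= n for s, n in zip(size, stock))

    def minus(stock, size):
        return tuple(n - s for n, s in zip(stock, size))

    @lru_cache(maxsize=None)
    def value(stock, t):
        """Optimal expected revenue from period t on, before its request is seen."""
        if t >= horizon:
            return 0.0  # leftover resources are worthless at the deadline
        keep = value(stock, t + 1)
        total = p_none * keep
        for p, size, price in request_dist:
            if fits(size, stock):
                accept = price + value(minus(stock, size), t + 1)
                total += p * max(keep, accept)
            else:
                total += p * keep  # the request cannot be served
        return total

    def threshold(stock, size, t):
        """Price above which accepting a request of `size` in period t pays off."""
        if not fits(size, stock):
            return float("inf")
        return value(stock, t + 1) - value(minus(stock, size), t + 1)

    return value, threshold
```

For example, with one unit of a one-dimensional resource, two periods and requests of size 1 priced 10 or 4 with probability 0.5 each, `value((1,), 0)` evaluates to 8.5 and `threshold((1,), (1,), 0)` to 7.0, so the low-priced request is denied in the first period.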

Depending on the current amount of and the demand for the different
resources, the threshold may depend on all or only some of the demanded
resources. E.g. if an arrival demands one critical and one non-critical resource,
the threshold will depend mainly (or only) on the critical resource. But the
non-critical resource might become critical through high demand. There are
examples where the thresholds react very sensitively to changes of the critical
resource.

Another interesting observation is the behavior of these thresholds depending
on the remaining resources. In most cases an increase of the available resources
leads to decreasing prices/thresholds (or to constant prices if there is plenty
of this resource available). But in some cases several 'jumps' (up and down)
can be observed when the available resources are increased (especially when
demands for huge amounts of some resources with high revenues are possible,
e.g. group flights, etc.).

References 

1. Kleywegt A., Papastavrou J. (1998) The Dynamic and Stochastic Knapsack
Problem. Operations Research 46, January/February, 17-35

2. Kleywegt A., Papastavrou J. (2001) The Dynamic and Stochastic Knapsack
Problem with Random Sized Items. Operations Research 49, January/February,
26-41

3. McGill J., van Ryzin G. (1999) Revenue Management: Research Overview and
Prospects. Transportation Science 33, May, 233-256

4. van Slyke R., Young Y. (2000) Finite Horizon Stochastic Knapsacks with
Applications to Yield Management. Operations Research 48, January/February,
155-172





A Note on Quantitative Stability and 
Empirical Estimates in Stochastic 
Programming 



Vlasta Kankova^1 and Michal Houda^2

^1 Institute of Information Theory and Automation
Academy of Sciences of the Czech Republic
Pod vodarenskou vezi 4, 182 08 Praha 8, Czech Republic
kankova@utia.cas.cz

^2 Charles University, Faculty of Mathematics and Physics
Sokolovska 83, 186 75 Praha 8, Czech Republic
houda@csmat.karlin.mff.cuni.cz



Abstract. The paper deals with the stability of stochastic programming
problems considered with respect to a probability measure space. In particular,
the paper deals with the stability of problems in which the operator of
mathematical expectation appears in the objective function, the constraint set
is “deterministic” and the probability measure space is equipped with the
Kolmogorov or the Wasserstein metric. The stability results are furthermore
applied to statistical estimates in stochastic programming problems.
Some results on consistency and the rate of convergence are presented.

1 Introduction 

From the mathematical point of view many applications correspond to
deterministic optimization problems depending on a probability measure. To
introduce this type of problem let $(\Omega, \mathcal{S}, P)$ be a probability space,
$\xi$ ($:= \xi(\omega) = [\xi_1(\omega), \ldots, \xi_s(\omega)]$) be an $s$-dimensional random vector defined
on $(\Omega, \mathcal{S}, P)$, $F$ ($:= F(z)$, $z \in R^s$) be the distribution function of $\xi$, $P_F$
denote the probability measure corresponding to $F$, and $Z_F \subset R^s$ denote the
support of $P_F$. Let, furthermore, $g(x,z)$ be a real-valued function defined on
$R^n \times R^s$ and $X \subset R^n$ be a nonempty set. The symbol $R^n$, $n \ge 1$, is reserved for
the $n$-dimensional Euclidean space.

A rather general problem of the above-mentioned type can be introduced in the
following form: Find

$$\varphi(F) = \inf\{E_F\, g(x, \xi) \mid x \in X\}. \qquad (1)$$

(The symbol $E_F$ denotes the operator of mathematical expectation
corresponding to $F$.)

To solve the problem (1), complete information on $F$ is necessary.
However, in applications very often at least one of the following cases happens:
$F$ must be replaced by its statistical estimate; $F$ must be (because of numerical
difficulties) replaced by a simpler one; the actual distribution function
is a slightly modified $F$. Consequently, it is suitable (or even necessary) to
investigate the stability w.r.t. the probability measure space and statistical
estimates. In the literature, great attention has been paid to both problems.
We can recall e.g. the papers [1], [4], [7], [8], [11].

2 Problem Analysis 

To investigate the stability of the problem (1) we can state assumptions
under which there exist functions $m_1$, $m_2$ of a real argument such that

$$|\varphi(F) - \varphi(G)| \le m_1(d_S(P_F, P_G)), \qquad \|x(F) - x(G)\| \le m_2(d_S(P_F, P_G)), \qquad (2)$$

where

$$x(F) = \arg\min\{E_F\, g(x, \xi) \mid x \in X\} \qquad (3)$$

in the case when $x(F)$ is well-defined, $G$ is an $s$-dimensional distribution function
“near” to $F$ and $d_S$ is a “suitable” metric, say Kolmogorov or Wasserstein.
It is known that the Kolmogorov metric is defined by

$$d_K(F, G) = d_K(P_F, P_G) = \sup_{z \in R^s} |F(z) - G(z)|. \qquad (4)$$

To define the Wasserstein metric $d_{W_1}(F, G)$, let $\mathcal{P}(R^s)$ denote the set
of all (Borel) probability measures in $R^s$. Let
$\mathcal{M}_1(R^s) = \{\nu \in \mathcal{P}(R^s) : \int_{R^s} \|z\|\, \nu(dz) < \infty\}$ and let $\mathcal{D}(\nu, \mu)$ denote
the set of those measures in $\mathcal{P}(R^s \times R^s)$ whose marginal measures are $\nu$ and $\mu$. Then

$$d_{W_1}(\nu, \mu) = \inf\Big\{\int_{R^s \times R^s} \|z - \bar z\|\, \kappa(dz \times d\bar z) : \kappa \in \mathcal{D}(\nu, \mu)\Big\}, \qquad \nu, \mu \in \mathcal{M}_1(R^s).$$

Both metrics “measure” the distance between two probability measures.
However, a “strange” situation can happen.

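Example 1. Let $s = 1$. Let, moreover, $\varepsilon > 0$ and $a \in R^1$ be arbitrary and

$$F(z) = \begin{cases} 0 & \text{for } z \in (-\infty, a), \\ 1 & \text{for } z \in (a, +\infty), \end{cases} \qquad G(z) = F(z - \varepsilon) \ \text{for } z \in (-\infty, +\infty);$$

then $d_{W_1}(F, G) = \varepsilon$ and $d_K(F, G) = 1$.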
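
The two distances in Example 1 can be checked numerically. The sketch below is only an illustration under stated assumptions: it relies on the known identity $d_{W_1}(F,G) = \int |F(z) - G(z)|\,dz$ for $s = 1$, approximates the integral by the midpoint rule and the supremum by a maximum over grid points (which can only underestimate it), and uses made-up values of $a$ and $\varepsilon$. The function names are hypothetical.

```python
def kolmogorov_and_wasserstein(F, G, lo, hi, steps=100_000):
    """Approximate sup|F - G| and the integral of |F - G| over [lo, hi].

    F, G are one-dimensional distribution functions; both are assumed to
    coincide outside [lo, hi], so nothing is lost by truncating there.
    """
    h = (hi - lo) / steps
    d_k, d_w = 0.0, 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * h          # midpoint of the i-th grid cell
        diff = abs(F(z) - G(z))
        d_k = max(d_k, diff)            # grid approximation of the supremum
        d_w += diff * h                 # midpoint rule for the integral
    return d_k, d_w


def example_1(a=0.3, eps=0.05):
    """Unit jump at a versus the same jump shifted by eps (illustrative values)."""
    F = lambda z: 0.0 if z < a else 1.0
    G = lambda z: F(z - eps)
    return kolmogorov_and_wasserstein(F, G, lo=a - 1.0, hi=a + 1.0)
```

Calling `example_1()` returns a Kolmogorov distance of 1 and a Wasserstein distance close to the chosen shift, however small that shift is made.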

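Example 2. Let $s = 1$. Let, moreover, $M < 0$, $\varepsilon \in (0, 1)$ be arbitrary and

$$F(z) = \begin{cases} -\dfrac{\varepsilon}{z - (M+1)} & \text{for } z \in (-\infty, M), \\ \varepsilon & z = M, \\ \text{arbitrary} & z \in (M, +\infty), \end{cases} \qquad G(z) = \begin{cases} 0 & z \in (-\infty, M), \\ F(z) & z \in (M, +\infty). \end{cases}$$

Evidently, $d_{W_1}(F, G) = +\infty$, $d_K(F, G) = \sup_z |F(z) - G(z)| = \varepsilon$.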








3 Stability Results 

To recall the stability results we introduce the following system of assumptions:

A.1 a. $g(x, z)$ is a uniformly continuous function on $X \times R^s$,

b. for every $x \in X$, $g(x, z)$ is a Lipschitz function of $z \in R^s$ with the
Lipschitz constant $L$ not depending on $x \in X$,

c. $X$ is a convex set and simultaneously $g(x, z)$ is a strongly convex
function on $X$.

Proposition 1. [6] Let S > 0 be arbitrary, X C be a nonempty, com- 
pact set. If A. la, A. lb are fulfilled, then there exist F^{z), Fg{z), z £ Es 
(Fs(-) < Fg{-)) such that if G is s-dimensional distribution function fulfill- 
ing the relations 

G{z) = (F/(z), F^{z)), z e i?*, Pf,Pg G Mi{R^), 



then 

\<p{F) - ip{G)\ < SL. 

If, moreover, the assumption A,lc is fulfilled, then also 

\\x{F)-xiG)r<UL. 



Employing the Kolmogorov metric it follows from Proposition 1. 
Theorem 1. [5] Let X C R^ be a compact set, a finite exists. If 

1. the assumptions A. la and A, lb are fulfilled, 

^ / / / 

2. Zp = n > 0? < S’? i ~ 1? 2, 5 and, moreover, 

there exists a constant > 0 such that h{z) > 'd for every z £ Zp, 

3. G{z) is an arbitrary s-dimensional distribution such that 



Zg C Zf{S' ) for s' = 2 i , 



2( <mm(Cj-Cj), 



then 

If, moreover, the assumption A.lc is fulfilled, then also 



64 



||a:(F)-a:(G)||2 < - Ly/^{ 



,2d K{F, G) 
1? 






($Z_F(\delta)$ denotes the $\delta$-neighbourhood of the set $Z_F$; the symbol $h$ is reserved
for the probability density corresponding to $P_F$.)








Theorem 2. Let $X$ be a nonempty compact set, $P_F \in \mathcal{M}_1(R^s)$. If the
assumptions A.1a and A.1b are fulfilled and $P_G \in \mathcal{M}_1(R^s)$ is arbitrary, then

$$|\varphi(F) - \varphi(G)| \le L\, d_{W_1}(F, G).$$

If, moreover, the assumption A.lc is fulfilled, then also 

\\x{F) - x{G)f <- Ldw^F, G). 

Q 

Theorem 2 is only a slight generalization of the assertion of Romisch and Schultz,
originally proven for complete linear recourse problems (see e.g. [3]).
Example 3. Let $s = 1$. Let, moreover, $\delta \in (0, 1)$ and $a \in R^1$ be arbitrary. If

$$F(z) = \begin{cases} 0 & \text{for } z \in (-\infty, 0), \\ z & z \in (0, 1), \\ 1 & z \in (1, +\infty), \end{cases} \qquad G(z) = F(z - \delta), \quad z \in (-\infty, +\infty),$$

then $\vartheta = 1$ and simultaneously $d_{W_1}(F, G) = d_K(F, G) = \delta$. The upper
bounds given by the Wasserstein and the Kolmogorov metrics are the same
for every $g$ fulfilling the system of assumptions A.1a and A.1b.

Example 4. Let 5 = 1. If iV is an odd natural number {N > 3), F{z) := 
Fn{z), G{z) :=Gn{z), N = 5,7, ... successively. 



Fn{z) = 0, 



N-l ] 
2 J 



2(N-1)’ 

I + 2(N-3) 

1 , 



2 e (-00, 0), 

V-: 

2 



^ e (0, ^), 



r - ^+ 3 \ 

^ \ 2 ’ 2 



^],2e(^,AT), 



z G {{N, +(X)), 



Gn{z) = Fn{z), 2 € (- 00 , ^)U (^, +oo), 

1 (E^ E+z, 

then dw^ {F, G) = sup \F{z) - G{z)\ = dK{F, G) = 

Z 

Consequently, the upper bound given by the Wasserstein metric is |L. How- 
ever, since '•= '& — f ( F- T ) ’ upper bound given by the Kolmogorov 

metric is L{N ~ 1). 



4 Relationship between Stability and Statistical 
Estimates 

If $\{\xi^i\}_{i=1}^{\infty}$ is a sequence of $s$-dimensional random vectors with common
distribution function $F$, $F_N$ denotes a “good” statistical estimate of $F$
determined by $\{\xi^i\}_{i=1}^{N}$ and $F$ is replaced by $F_N$, then we obtain (under rather
general assumptions) “good” estimates $\varphi(F_N)$ of $\varphi(F)$, $N = 1, 2, \ldots$

To present the corresponding assertion we introduce the assumptions 








B.1 a. $\{\xi^i\}_{i=1}^{\infty}$ is a sequence of $s$-dimensional random vectors with
common distribution function $F$,
b. $F_N$ is a statistical estimate of $F$ determined by $\{\xi^i\}_{i=1}^{N}$.

Evidently, we can obtain the following corollaries.

Corollary 1. Let X C En be a compact set, t > 0, a finite exist If the 
assumptions 1, 2 of Theorem 1 are fulfilled an if moreover F/v(-) fulfills the 
system of the assumptions B. 1 such that 



ZFr,CZ{S), S = 2( 



2dx(F, Fiv) 1 



'd 



o^ 2dK(F, Fn) ^ 



<min{Cj-Cj) a.s., 



then 

PMF) - >t}< > t} 

Corollary 2. Let $X$ be a nonempty compact set, $P_F \in \mathcal{M}_1(R^s)$, $t > 0$. If
the assumptions A.1a and A.1b are fulfilled, and if, moreover,
the system of assumptions B.1 is fulfilled and $P_{F_N} \in \mathcal{M}_1(R^s)$ a.s., then

$$P\{|\varphi(F) - \varphi(F_N)| > t\} \le P\{L\, d_{W_1}(F, F_N) > t\}.$$



Employing the assertion of Corollary 1 and the known properties of the
Kolmogorov metric we can obtain (for “empirical estimates”) an upper
bound independent of $P_F$ (of course under some additional assumptions);
see e.g. [2]. In a special case we can even employ the limit Kolmogorov
distribution function; see e.g. [9]. In the case of the Wasserstein metric this upper
bound depends on the distribution $P_F$; see [9]. However, the following
assertion can be proven.

Theorem 3. [5] Let X be a compact set, a finite Ep^ exist. If the assump- 
tions 1, 2 of Theorem 1 and the assumptions B.la and B.lb are fulfilled and, 
moreover, 

p{N^s g^p \F{z) - Fn{z)\ > t} ->(jv-y+oo) 0 for a > 0 and every t > 0, 
P{Zf, C z^(?hpJ£MvM)} 1, 

then 

P{N'’\ip{F) - vp(Pjv)] > t} ^(^^+oo) 0. (5) 

Theorem 4. Let $X$ be a nonempty compact set, $P_F \in \mathcal{M}_1(R^s)$, $t > 0$. If

1. the assumptions A.1a and A.1b are fulfilled,

2. $\{\xi^i\}_{i=1}^{\infty}$ is a sequence of independent identically distributed $s$-dimensional
random vectors with common distribution function $F$,

3. $F_N$ is the empirical distribution function determined by $\{\xi^i\}_{i=1}^{N}$,

then

$$P\{|\varphi(F) - \varphi(F_N)| \xrightarrow[N \to +\infty]{} 0\} = 1.$$

Remark 1. Of course, the corresponding results can be proven (under the
assumption A.1c) for the estimate $x(F_N)$ of $x(F)$.
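
The behaviour of empirical estimates can be illustrated by a small simulation. The sketch below is a toy instance chosen for illustration and is not taken from the paper: $s = 1$, $X = [0, 1]$, $g(x, z) = |x - z|$ (Lipschitz in $z$ with $L = 1$) and $\xi$ uniform on $[0, 1]$, so that $\varphi(F) = 1/4$. It compares the gap $|\varphi(F) - \varphi(F_N)|$ with the Wasserstein bound $L\, d_{W_1}(F, F_N)$, using the one-dimensional identity $d_{W_1}(F, F_N) = \int_0^1 |F_N(z) - z|\,dz$ and a midpoint-rule approximation. All function names are hypothetical.

```python
import random


def empirical_optimal_value(sample):
    """phi(F_N): minimum over x of the sample mean of |x - xi_i|.

    A sample median minimizes the mean absolute deviation, and it lies in
    X = [0, 1] because all observations do.
    """
    xs = sorted(sample)
    median = xs[len(xs) // 2]
    return sum(abs(median - z) for z in xs) / len(xs)


def wasserstein_to_uniform(sample, steps=20_000):
    """Midpoint-rule approximation of the integral of |F_N(z) - z| over [0, 1]."""
    xs = sorted(sample)
    n = len(xs)
    h = 1.0 / steps
    total, j = 0.0, 0
    for i in range(steps):
        z = (i + 0.5) * h
        while j < n and xs[j] <= z:     # j counts the observations <= z
            j += 1
        total += abs(j / n - z) * h
    return total


def gap_and_bound(n, seed=0):
    """Return (|phi(F) - phi(F_N)|, L * d_W1(F, F_N)) for a sample of size n."""
    rng = random.Random(seed)
    sample = [rng.random() for _ in range(n)]
    gap = abs(0.25 - empirical_optimal_value(sample))
    bound = wasserstein_to_uniform(sample)  # Lipschitz constant L = 1
    return gap, bound
```

Running `gap_and_bound(n)` for increasing `n` lets one observe both quantities shrinking, with the gap staying below the bound up to the integration error.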

Acknowledgement: This research was supported by the Grant Agency
of the Czech Republic under Grants 402/02/1015 and 402/01/0539 and by
Grant 436TSE113/40 of the Deutsche Forschungsgemeinschaft.

References 

1. Dupacova, J. and Wets, R. J.-B. (1984): Asymptotic behaviour of statistical
estimates and optimal solutions of stochastic optimization problems, Ann.
Statist., 16, 1517-1549.

2. Dvoretzky, A., Kiefer, J. and Wolfowitz, J. (1956): Asymptotic minimax character
of the sample distribution function and the classical multinomial estimate.
Ann. Math. Statist., 27, 642-669.

3. Houda, M. (2001): Stability and Estimates in Stochastic Programming (Special
cases) (in Czech). Diploma Work. Faculty of Mathematics and Physics, Charles
University.

4. Kankova, V. (1994): A note on estimates in stochastic programming, J. Comput.
Math., 56, 97-112.

5. Kankova, V. (1994): A note on the relationship between distribution function
estimation and estimation in stochastic programming. Trans. 12th Prague Conf.
1978, Academia, 122-125.

6. Kankova (1994): On Distribution Sensitivity in Stochastic Programming. VZ
UTIA AV CR No. 1826.

7. Romisch, W. and Schultz, R. (1993): Stability of solutions for stochastic programs
with complete recourse. Math. Oper. Res., 18, 590-609.

8. Shapiro, A. (1994): Quantitative stability in stochastic programming. Math.
Progr., 67, 99-108.

9. Shorack, G. R. and Wellner, J. A. (1986): Empirical Processes with Applications
to Statistics. Wiley, London.

10. Schultz, R. (2000): Some aspects of stability in stochastic programming, Ann.
Oper. Res., 100, 55-84.

11. Vogel, S. (1992): On stability in multiobjective programming - a stochastic
approach. Math. Progr., 56, 91-119.





Splitting and Localization of the Epi- Topology 
Combined with Randomness * 



Petr Lachout^1,2,3

^1 Department of Probability and Statistics, Charles University of Prague,
Sokolovska 83, 186 75 Praha 8, Czech Republic
^2 Institute of Information Theory and Automation, Czech Academy of Sciences,
Pod vodarenskou vezi 4, 182 08 Praha 8, Czech Republic.

^3 e-mail: lachout@karlin.mff.cuni.cz



Abstract. The paper introduces an extension of the epi-convergence, the lower
semicontinuous approximation and the epi-upper semicontinuous approximation of
random real functions in probability. The new notions could be helpful tools for
sensitivity analyses of stochastic optimization problems. The research is motivated by
[4] and continues the research started in [6], [7] and [5].



1 Introduction 

The epi-convergence of real functions is a powerful tool for studying the
sensitivity of optimization problems; see e.g. [1], [2]. In a sufficiently nice topological space
the epi-convergence is induced by the epi-topology, i.e. a restriction of the Fell
topology on sets; see [1], [2], [3] for details. The epi-convergence is originally
defined by two different properties. The first one is induced by the miss-part of
the Fell topology and the other by the hit-part of the Fell topology. In [4] it is suggested
to employ these parts separately ("splitting") and to consider them on a given
set ("localization"). The split and localized parts of the epi-convergence are
called lower and epi-upper semicontinuous approximations. Unfortunately,
these notions are not induced by any topology. The paper offers a general
concept for overcoming this difficulty and presents an extension to random
functions that yields the desired consequences for the sensitivity of stochastic
optimization problems.

The epi-convergence itself is based on the epi-topology. Hence, its extension
to random functions can be done in a standard way; see [3]. Our case is
not directly based on a topology, because of the restriction to a given set. We
have to look for a more sophisticated description. A convenient description
for Euclidean spaces was introduced in [4], [6], [7]. The concept was
generalized in [5].



* The research has been partially supported by Deutsche Forschungsgemeinschaft
grant No. 436TSE113/40, the project AV: K1019101, the project CEZ: MSM
113200008 and the Czech Republic Grant 201/02/0621.







2 Epi-convergence and lower and epi-upper 
semicontinuous approximations 

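This section deals with the epi-convergence of functions. We will work in a
topological space $\mathcal{X}$.

Definition 1. Let us denote by FCE($\mathcal{X}$) the set of all functions $f : \mathcal{X} \to \overline{\mathbb{R}}$
and

$$\varphi(f; A) = \inf_{x \in A} f(x), \quad \text{where } A \subset \mathcal{X},$$

$$\Phi(f) = \{x \in \mathcal{X} : f(x) = \varphi(f; \mathcal{X})\},$$

$$\mathrm{epi}(f) = \{(x, t) \in \mathcal{X} \times \mathbb{R} : f(x) \le t\}$$

for any function $f \in$ FCE($\mathcal{X}$).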

Let us recall that the Fell topology restricted to the set of all epi-graphs is
called the epi-topology. In this paper, we treat finer objects.

Definition 2. A function $f : \mathcal{X} \to \overline{\mathbb{R}}$ having an epi-graph closed in $\mathcal{X} \times \mathbb{R}$ is
called lower semicontinuous (l.s.c.). We denote the set of all l.s.c. functions
on $\mathcal{X}$ by LSC($\mathcal{X}$).

Now, we introduce the lower and the epi-upper semicontinuous approximations
in accordance with [4].

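Definition 3. Let $\langle f_\lambda \rangle_{\lambda \in \Lambda}$ be a net in FCE($\mathcal{X}$), $f \in$ FCE($\mathcal{X}$) and
$\emptyset \neq Y \subset \mathcal{X}$. We say:

(i) $f_\lambda$ is a lower semicontinuous approximation to $f$ at $Y$,
notation $f_\lambda \xrightarrow[\lambda \in \Lambda;\, Y]{l} f$, if

$$\sup_{G \in \mathcal{G}(\mathcal{X}),\, x \in G}\ \liminf_{\lambda \in \Lambda}\ \inf_{y \in G} f_\lambda(y) \ge f(x) \quad \forall x \in Y;$$

(ii) $f_\lambda$ is an epi-upper semicontinuous approximation to $f$ at $Y$,
notation $f_\lambda \xrightarrow[\lambda \in \Lambda;\, Y]{epi\text{-}u} f$, if

$$\sup_{G \in \mathcal{G}(\mathcal{X}),\, x \in G}\ \limsup_{\lambda \in \Lambda}\ \inf_{y \in G} f_\lambda(y) \le f(x) \quad \forall x \in Y;$$

(iii) $f_\lambda$ is epi-convergent to $f$ at $Y$, notation $f_\lambda \xrightarrow[\lambda \in \Lambda;\, Y]{epi} f$,
if $f_\lambda \xrightarrow[\lambda \in \Lambda;\, Y]{l} f$ and $f_\lambda \xrightarrow[\lambda \in \Lambda;\, Y]{epi\text{-}u} f$.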



The introduced semicontinuous approximations are closely connected to 
sensitivity analysis of optimization problems. 
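
The connection to sensitivity of optimal values and optimal solutions can be illustrated on a deliberately simple example, which is an assumption of this sketch and not taken from the paper: on the compact set $[-1, 1]$ the functions $f_n(x) = |x - 1/n|$ converge uniformly to $f(x) = |x|$ (since $|f_n(x) - f(x)| \le 1/n$), which for continuous functions implies epi-convergence. The code approximates the optimal value and a minimizer of each function on a grid; the names `grid_minimum` and `perturbed` are hypothetical.

```python
def grid_minimum(f, lo=-1.0, hi=1.0, points=2001):
    """Return (approximate infimum, approximate minimizer) of f on [lo, hi].

    The interval is scanned on an equidistant grid, so the result is exact
    only up to the grid spacing.
    """
    step = (hi - lo) / (points - 1)
    best_x, best_val = lo, f(lo)
    for i in range(1, points):
        x = lo + i * step
        val = f(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_val, best_x


def perturbed(n):
    """f_n(x) = |x - 1/n|, a perturbation of the limit function f(x) = |x|."""
    return lambda x: abs(x - 1.0 / n)
```

Evaluating `grid_minimum(perturbed(n))` for growing `n` shows the optimal values staying near 0 and the minimizers moving towards 0, the minimizer of the limit function.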








Lemma 1. Let X be a compact < fx >xeA be in FCE(A') and f € LSC(A'). 

(i) If fx — y / then lim inf ^{fx-,H)>(p (/; H) for all H 6 . 

\£A,X XgA 

(a) If there is z e ^ (/) such that f\ > f then 

\£A;z 

limsup;^g^ (/? (/a; A”) < (^ (/; A"). 

(Hi) If fx y f and there is z E ^ (/) such that f\ > f then 

XeA;A^ XeA,z 

limxe/i ‘P ify, = (/; X) and K-limsup<? (/a) C if), 

\ 

where K-limsup denotes Kuratowski limes superior of sets. 

A proof of Lemma 1 can be done in a similar way as the proof of Theorem
7.33, p. 266 in [2]. The semicontinuous approximations possess equivalent
definitions employing the hit- or miss-part of the Fell topology.

Proposition 1. Let X be a locally compact topological space, < f\ >\^a be 
a net in FCE(A'), / G LSC(A') and 0 / T C A'. 

Then f\ — y f if and only if for each G E g(x x M) , Y x C G 

and K E IC^X xM.^ such that epi(/) C\G D K = 0 and (T x R) fl AT 0 

there is Q E q{^X x R^ such that Y xWcQcG and ep\{f\) fl Q H if = 0 
eventually. 

Proposition 2. Let X be a topological space, < f\ >\^a be a net in FCE(A'), 
/ G LSC(A^) and^D^Y CX. 

Then fx f */ only if for each H E q(^X xR^ Hn{YxW) ^ ib 

fulfilling epi(/) Pi Q fl if ^0 for all Q E g(^X xR^, Y x R C Q we have 
^pK/a) n G n i? ^ 0 eventually for each G G 5 x R^ , G D T x R. 

3 Combining with randomness 

Now, we present a suggestion of how to combine semicontinuous approximations
with randomness. We consider nets of random functions $F : \Omega \to$ FCE($\mathcal{X}$). We
will denote by rFCE($\mathcal{X}$) the space of all random functions and by rLSC($\mathcal{X}$)
the set of all random l.s.c. functions.

Definition 4. Let A' be a locally compact topological space, < Fx >A€.i be 
a net in rFCE{X), F E rFCE(A') and 0 / F C A'. 

We define Fx — F if for each e > 0, G e afA" x iV F x 1 c G 
xeAiY V / 

and K E IC(^X xR^ such that {Y x R) fi if / 0 there is Q E g(^X xR^ 

such that Y xR C Q C G and 

lim sup P* (epi(FA) fl Q H if 7 ^ 0, epi(F) n G D if = 0) < 6 . 

Aeyi 








Definition 5. Let A' be a topological space, < F\ >xeA be a net in rFCE(A'), 
F e rfCE{X) and 0 # F C A'. 

We define Fx ^ ^ e > 0, if € C? x l) , 

^ n (y X 1) ^ 0 there is G e g(^X x l), G D F x R such that 

lim sup P*(epi(FA) n G n fi = 0, epi(F) n G n if ^ 0) < e. 

A€/i 

The proposed approximations yield desired convergences of optimal values 
and optimal solutions. 

Theorem 1. Let X be a compact, < F\ >\^a be a net in rFCE(A') and 
Fg rLSC(A^). 

(i) If Fx > F then for all A eR 

\£A]X 

lim P* i^{Fx;K)<A<ip (F; K)) = 0 
for all K e K{X). 

(ii) If there is Z C X such that P,(Z n F (F) / 0) = 1 and 

epi-n-prob ^ ^ all A &R 

X£A-,Z 

lim p*{ip {Fx;X)>A>ifi (F; X)) = 0. 



^ nTob 

(in) If ip (F; X) is a random variable, F\ > F and there is Z C X 

AGvl;Af 

such that P.(Z n (F) 0) = 1 and Fx F then 

\£A\Z 



^iFx;X)-f^V>{F-,X). 

X£A 

If moreover p (F; K) is a random variable for all K G IC{X) then 

limsupP*(^(F\) C G) = 1 
xeA 

for allGeG{X),GD^{F), 

Proof 1. In the case (i) the sets from the definition 4 must fulfill 
G = Q = -F X 1 . Hence, H = Kx [-00, A]^lc(^XxR^ and 

limsupP*(¥j(FA;F) < A<if{F-,K)) < 

XeA 

< limsupP*(epi(FA) n (F x [-oo,zi])^0,epi(F) fl (F x [-oo,Z\]) = 0) 
AG^ 

= lim sup P* (epi(FA) H Q H F ^ 0, epi(F) H G fl F = 0) = 0, 

AG^ 

since the choice is independent on e: > 0. 








2. To show (ii) take e > 0, 1/ € Q{^), U D Z and set H — U x [-oo,2l). 
According to Definition 5 then there is a F € Q{^), i-e. G = F x R, such 
that U dV D Z and 

lira sup P* ((p (Fa; X) > A > p {F] X)) < 

A6/1 

< limsupP*(epi(FA)n(F x [-oo,A)) = 0,epi(F) n (F x [-oo,2l))#0) 
A6/1 

= limsupP*(epi(FA) r\Gf)H = 0,epi(F) C\G D H ^ <b) < e. 
xeA 

3. Convergence of optimal values in probability in the case (iii) follows from 



limsup P*(^(F;A) + 2 Z\<<^(Fa;A)) < 

Agyl 



< Urn sup V P* (p (F; X) + 2A<ip{Fx;X) ,nA<ip (F; A) < (n + 1)^) 
+ P{\>fiF-,X)\>NA)< 

N 

< limsup P*{(p{F;X) < {n + 1)A < {n + 2)A < ip{Fx-,X)) + 

+ P{\p> (F; X) \>NA)^ P(|^ (F; X) \>NA), 



limsup P*((/?(F\; A) + 2A < tp{F;X)) < 

Ae/i 



< lim sup V P* (v? (F a ;X) +2A<p{F-,X) ,nA<p (F; A) < (n + 1)^) 
+ Pi\p{F-,X)\>NA)< 

N 

< limsup P*(v? (Fx; A) < (n — 1)A < nA < (/; (F; A)) + 

n^N 

+ P{\p (F; X) \>NA) = P(|yp (F; X) \>NA), 



The property of optimal solutions can be shown directly 

limsupP*(<?(FA ?:G)) < 

AG/l 



N 

< lim sup V' P*(^ (Fa ^ G) ,nA < p (F; X) < {n + 1)Z\) 
+ P(|^(F;A)| >iVA) < 








N 

<limsup P* {^{F\ G) ,nA < ip{F]X) < {n-\-l)A, 

, (Fa; A' \ G) < (n + 1)A) + P{\ip (F; X) \>NA)< 

N 

< lim sup P{nA < (p (F; A') < (n + 1)A, p{F]X \ G) < {n + 2) A) 

nt^N 

+ P{\p>{F-X)\>NA)< 

< P{p>[F'X)>ip{F',X\G)-2A)^-P{\p{F]X)\>NA). 

Hence, 

lim sup P*(<? (Fa <t G)) < P{p (F; X)>p{F;X\ G)) = 0, 

AG^l 

since P(G D # (F)) — 1. 

References 

1. G. Beer (1993) Topologies on Closed and Closed Convex Sets. Kluwer Academic
Publishers, Dordrecht.

2. T. Rockafellar, R. J-B Wets (1998) Variational Analysis. Springer-Verlag, Berlin.

3. G. Salinetti and R.J.-B. Wets (1981) On the convergence of closed-valued
measurable multifunctions. Transactions of the American Mathematical Society, 266,
pp. 275-289.

4. S. Vogel (1994) A stochastic approach to stability in stochastic programming.
J. Comp. Appl. Math. 56, 65-96.

5. P. Lachout (2000) Epi-convergence versus topology: splitting and localization.
The research report of The Institute of Information Theory and Automation
No. 1987, Prague.

6. S. Vogel and P. Lachout (2000) On continuous convergence and epi-convergence
of random functions I. The research report of The Institute of Information
Theory and Automation No. 1988, Prague.

7. S. Vogel and P. Lachout (2000) On continuous convergence and epi-convergence
of random functions II. The research report of The Institute of Information
Theory and Automation No. 1989, Prague.





Standards for Modeling and Simulation



Claus-Burkard Böhnlein

Lehrstuhl für BWL und Wirtschaftsinformatik, Universität Würzburg,
Neubaustraße 66, D-97070 Würzburg, boehnlein@wiinf.uni-wuerzburg.de



1 Decision Support by Means of Simulation

Simulation is the systematic playing through of the behavior of systems that are
planned, under development or already in existence. It is based on a simulation
model that reproduces those aspects of the system under consideration that are
relevant to the simulation. Simulation is thus goal-directed experimentation
with models that replicate reality (Oberweis et al. 1999; Bossel 1992).

Simulation plays an important role in the analysis of business processes,
since it permits statements about process execution under different conditions and
with respect to different aspects, such as costs, throughput times, resource
consumption or machine utilization. In the past, simulation has therefore provided
important support for process planning, process improvement and the design of
information systems for process execution.



1.1 Purposes of Simulation

Depending on the objective, OBERWEIS names different purposes of simulation, which
are listed below in order of increasing task complexity (Oberweis et al. 1999):

□ Playing through the behaviour of real, existing systems is used in presentations
and training courses. Here the focus is on animation to illustrate system
behaviour.

□ In large development projects, visualization of the system structure allows
targeted communication about planned or existing processes.

□ By validating system designs, the simulation model can be used to check whether
the proposed system behaves as planned.

□ By anticipating system behaviour in areas such as structural engineering, crash
behaviour, or climate development, simulation allows system limits to be tested
and critical system states to be detected early. This matters when experiments
on the real system are too expensive, too risky, or impossible in principle.







□ Simulation is frequently used to investigate and improve a system's reaction to
external influences or to changes in structure or rules within the system.



1.2 Problem-Solving Process

Once a problem worth simulating has been identified, it is solved in a multi-stage
process. First, the problem is described as precisely as possible and the required
data are collected. The problem is then analysed and transferred into an abstract
model. The model is validated against historical data and corrected where
necessary. Various modifications of the model and its parameters are examined in
iterative simulation runs in order to find valid solutions that improve on the
initial situation; these are then proposed for implementation (cf. Fig. 1).



Fig. 1. Problem-solving process

Up to this point only effort has been expended; the improved solution can be put
to profitable use only after implementation. This presupposes, however, that the
system environment has not changed; otherwise, in the worst case, the proposed
solution would become unusable. In dynamic system environments it is therefore
crucial that a problem is recognized and solved as quickly as possible, so that
the solution can be used immediately after implementation.



2 Markets in Transition

In the past, a shift took place from a seller's market to a buyer's market,
characterized by intensified competition. In increasingly saturated markets,
products are harder to sell, and customers can choose between different suppliers
of comparable products. In this situation, manufacturers try to increase customer
loyalty and to differentiate themselves from their competitors through
customer-specific products and short product life cycles. Since a growing variety
of product variants generally goes hand in hand with shrinking lot sizes, not only
the coordination effort but also the cost pressure within companies rises.
Companies therefore begin to concentrate on their core competencies, reduce their
depth of value creation, and increasingly procure services from external
suppliers. In this context, core competencies are essential technical,
technological, and organizational capabilities that are characterized by high
competitive effectiveness in the markets and by a large competitive advantage over
competitors.



3 Requirements for Companies

Since the coordination effort also rises with a growing number of suppliers,
manufacturers enter into development and production partnerships with so-called
system suppliers. From the manufacturers' point of view, this makes it possible
both to shorten development times for new products and to streamline internal
procurement and production processes. Distributing value creation across several
companies requires additional coordinating measures, however, because distributed
service provision demands precise alignment of dates, capacities, and quantities
across the different areas of all participating companies. To solve these
coordination tasks, a further convergence of integrated business processing and
cross-company coordination of the required material and information flows is
emerging for the coming years.

In concrete terms, this means that companies must

□ orient themselves to the market quickly and flexibly,

□ tap and develop employee potential,

□ build cooperations and networks internally and externally, and

□ make greater use of integrated information systems.

Coordination in such cross-company networks requires functions for carrying out
what-if analyses and simulation capabilities that provide decision support not
only at the strategic level but increasingly also at the tactical and operational
levels.



4 Status quo 

Measured against the market volume, a very large number of simulation environments
is available. All tools now offer data import and export via ASCII, DDE, ODBC, or
SQL, as well as a powerful graphical user interface (in some cases also
Internet-capable). Occasionally, integrated code generators are offered that
produce executable program code directly from the model (e.g. PACE from IBE). Some
commercial simulation environments also provide interfaces to common ERP systems.
Individual software vendors offer industry-specific planning solutions into which
a powerful simulation environment can be integrated (e.g. eM-Power and eM-Plant
from Tecnomatix).



5 Simulation in Integrated IT Environments

GRAF and PUTZLOCHER distinguish the product development, product creation,
customer order, and material procurement processes (cf. Fig. 2). The success of a
company depends crucially on a holistic view and coordination of all these
processes (Graf and Putzlocher 2002). Catchwords such as Digital Mock-Up and the
digital factory (Bracht U 2002) show that the integration of development systems
is being driven forward in key industries such as the automotive industry, and
newer planning systems, so-called Advanced Planning and Scheduling Systems (APS),
already offer simulation functions for what-if analyses.



Fig. 2. Business processes (Graf and Putzlocher 2002)

Product data management is increasingly used as a common data source for the
operational systems, and business intelligence systems apply data mining or text
mining techniques to derive knowledge from data via information (Bange C, Mertens
H, Keller P 2001; Voß S, Gutenschwager K 2001).

But what becomes of dynamic simulation when operational systems increasingly offer
simulation functions of their own, so that time-consuming data acquisition and
modelling are no longer needed?









6 Requirements for Modelling and Simulation

Powerful simulation environments for dynamic system analyses will retain their
central importance in the future, provided the deficits mentioned above are
remedied. From today's perspective, the following requirements must be met:

1. Uniform interfaces must be created for data exchange with operational systems.

2. By developing an XML-based language (Pardi W 1999) with special document type
definitions (DTD), simulation applications in a business environment can
communicate effectively with OLAP or data mining tools as well as with
relational databases.

3. Short-term statements and analyses are possible only if the time required for
data acquisition and modelling is reduced significantly. To this end,
operational systems must be integrated with simulation environments.

4. For systematic, efficient modelling and reuse of submodels, a Unified
Simulation Modelling Language (USML) must be developed, following the example of
the Unified Modelling Language (UML). This would support the exchange of models
and the interaction between distributed models, and would promote the
development of model generators.

5. Open-source simulation environments must be promoted so that, following the
example of LINUX, standards are established as early as the software development
stage and the variety of simulation environments is reduced to a sensible order
of magnitude.
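
What an XML-based exchange between a simulation tool and an operational system might look like can be sketched in Python. No such format is defined in this paper: the element and attribute names (`simulationModel`, `resource`, `capacity`, `parameter`) and the function `read_model` are hypothetical, and the sketch only shows how model data could be read from an XML document with the standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical exchange document; all tag and attribute names are assumptions.
DOCUMENT = """
<simulationModel name="assembly-line">
  <resource id="M1" capacity="2"/>
  <resource id="M2" capacity="1"/>
  <parameter name="interarrival_time" unit="min">4.5</parameter>
</simulationModel>
"""


def read_model(xml_text):
    """Parse the hypothetical exchange format into plain Python structures."""
    root = ET.fromstring(xml_text.strip())
    # Resources become a mapping from identifier to integer capacity.
    resources = {r.get("id"): int(r.get("capacity")) for r in root.findall("resource")}
    # Parameters become a mapping from name to numeric value.
    parameters = {p.get("name"): float(p.text) for p in root.findall("parameter")}
    return root.get("name"), resources, parameters


name, resources, parameters = read_model(DOCUMENT)
```

A shared document type definition would play the role of the informal structure assumed here, so that every tool reading the file agrees on which elements are allowed.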



Aspects of the standardization of modelling techniques, interfaces, and exchange
formats will in future become centrally important both for users and for vendors
of simulation environments. In the integrated IT environments of cooperating
companies, standards are a necessary prerequisite for decision support at the
operational level.







References

Bange C, Mertens H, Keller P (2001) OLAP und Business Intelligence. Oxygon,
Feldkirchen

Bossel H (1992) Modellbildung und Simulation - Konzepte, Verfahren und Modelle zum
Verhalten dynamischer Systeme. Vieweg, Braunschweig Wiesbaden

Bracht U (2002) Ansätze und Methoden der Digitalen Fabrik. Eröffnungsvortrag im
Tagungsband "Simulation und Visualisierung 2002", 28.02 und 01.03 Uni Magdeburg

Graf H, Putzlocher S (2002) DaimlerChrysler: Integrierte Beschaffungsnetzwerke. In:
Corsten D, Gabriel C (Hrsg.) Supply Chain Management erfolgreich umsetzen -
Grundlagen, Realisierung und Fallstudien. Springer, Berlin Heidelberg New York

Oberweis A, Lenz K, Gentner C (1999) Simulation betrieblicher Abläufe. In:
Hartmann-Wendels et al. (Hrsg.) wisu - das wirtschaftsstudium, 28 (1999) 02,
Düsseldorf, S 216-223, 245

Pardi W (1999) XML in Action. Dynamische und datengestützte Webseiten mit der
neusten Web-Technologie. Microsoft Press, Unterschleißheim

Voß S, Gutenschwager K (2001) Informationsmanagement. Springer, Berlin Heidelberg
New York





System Dynamics (SD) - An Approach within
Corporate Planning



Peter Bradl

Bavarian Research Center for Knowledge-based Systems - FORWISS, Information
Systems Research Group, Äußerer Laufer Platz 13-15, 90403 Nuremberg,
Germany



1 Corporate Planning 

Increasing speed in business, engagement in international markets, and the rising
costs of mismanagement and wrong decisions require that the effects of changes,
both endogenous and exogenous, can be foreseen. At the very least, it is important
to be able to revise existing action plans almost instantly according to the
current situation. Models that are built and validated according to the company's
current situation (including market information etc.) not only allow playing with
the status quo but also form the basis for simulations that provide results on
possible future outcomes.



1.1 Management Decisions and Planning

The difficult market situation and globalization are frequently mentioned by CEOs
when they have to explain (e.g. to shareholders) why their company does not or did
not perform as expected. Enterprise Resource Planning (ERP) software packages
already allow for static planning. Various statistical methods such as exponential
smoothing are known for correcting planning figures, and simulation tools are in
use to some extent. Still, most of these tools are useful predominantly for
operational planning. Corporate Planning, which has long been accepted as a core
management function (Lyneis 1980, Perlitz 2000, Ross 1993), goes beyond this and
yet is hardly supported by Information Systems.
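
As a brief illustration of the kind of statistical correction mentioned above, simple exponential smoothing can be sketched in Python. The function name, the data, and the smoothing constant are hypothetical and chosen only for illustration; commercial planning packages implement their own variants.

```python
def exponential_smoothing(values, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    smoothed = [values[0]]  # initialise with the first observation
    for x in values[1:]:
        smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical planning figures for four periods
sales = [100.0, 120.0, 90.0, 110.0]
forecast = exponential_smoothing(sales, alpha=0.5)
```

A larger `alpha` follows recent observations more closely, while a smaller one damps fluctuations. The method extrapolates a single series and does not represent the feedback between variables that Corporate Planning has to consider.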



1.2 Planning Approaches 

In the management literature, several approaches to planning were introduced
during the last decade (e.g. Total Quality Management, Business Process
Reengineering, or the Balanced Scorecard; see Maani a. Cavan 2000 pp. 4-5). They
mostly addressed a certain field or operational area. Besides the Balanced
Scorecard there were only a few individual techniques linking management aspects
(Macharzina 1999 p. 166). Organizational theory itself hardly allows the use of
figures; figures, however, carry and represent the type of information managers
are used to. Since planning strategies change over time (Macharzina 1999 p. 158),
they require a method that takes this into account as well - SD.



2 System Dynamics 

2.1 Systems Thinking 

Underlying the method of SD and the tools to apply it is the need to become
something of a systems thinker. This means, for example, recognizing that
feedbacks are part of all systems (department, company, market) we belong to. It
implies that our own actions will affect us later, probably in a way we did not
intend. An example:

In August 2000 the auction of UMTS licenses for Germany was held, yielding
about 100 billion Deutschmarks for the treasury. "Deutsche Telekom AG"
alone had to pay more than 16 billion Deutschmarks, which had to be financed.
As a result of increasing debts, Deutsche Telekom's capital structure changed.
Consequently the rating agencies downgraded the company, which in turn usually
causes a risk premium for future debts - the feedback loop with external
variables is closed. (Moody's 2002; Voba-Börse 2001)

More often than not, dependencies cannot be followed as easily as in the case of
"Deutsche Telekom AG"; however, it is important to realize that such effects
exist and that managers and everyone else involved in Corporate Planning have to
consider them.



2.2 Modelling as Part of the Management Process 

Systems thinking as the "starting point" for mental models has to be followed by
some steps to finally derive Stock Flow Diagrams (SFD) representing the SD model
for simulations.

As an intermediate stage towards SFDs, so-called Causal Loop Diagrams (CLD), which
allow identifying interactions and relationships between actions and, e.g., their
effect on different departments, are commonly used. Richardson describes some
serious problems that might occur when transforming CLDs into SFDs (Richardson
1986). But there is still a benefit in using CLDs, especially in Corporate
Planning, since they may function as a type of communication tool. A main problem
in decision-making processes (even on the corporate level) is that participants
have different understandings and terminologies (e.g. sales may be interpreted as
sales quantity or turnover). The use of CLDs, applying modelling techniques as
described e.g. in (Gomez a. Probst 1999) or (Maani a. Cavan 2000) within a group,
forces the members to use terms according to their own understanding within the
design process team. Finally, when discussing dependencies, different
interpretations become obvious. So modelling (especially CLDs) helps to gain a
common understanding of relevant terms.¹



2.3 Usability and Simulation

2.3.1 Hardware and Software

An important factor for the acceptance of SD in management - especially on the
corporate level - is that the models can be used anywhere and without detailed
knowledge of computer science (as is desirable for most IT in management).
Nowadays, several SD software packages (e.g. Vensim, Stella, or Powersim) run on
standard PC or Macintosh computers, equipment that is available in almost every
office - even on the management level. Additionally, graphical user interfaces may
be designed to support the manager or to provide help when entering figures, e.g.
by checking for consistency. The underlying analytical model is usually hidden
from the user, who sees only sliders to change inputs and charts.

2.3.2 Validation and Simulation

Due to the widespread use of ERP software in companies, data on almost every
transaction may be stored in a database or data warehouse. These figures are
important not only for the profit and loss account or for preparing the balance
sheet but are also extremely helpful in models that support Corporate Planning.
Modern SD software packages can transfer data directly from the data warehouse
into the modelling software, simulate, and may even write the results back into
the data warehouse. Hence, validation of models in financial planning might be
much easier compared to topics like environmental issues. Simulation runs are the
final stage in this process.²



3 Corporate Planning - Knowledge Management within a 
Software Company 

At the Bavarian Research Center for Knowledge Based Systems (FORWISS) in 
cooperation with SAP AG we are developing reference models for financial plan- 
ning that are based on relevant Key Performance Indicators and datasets trans- 
ferred from the Business Information Warehouse (BW) of the SAP/SEM to sup- 
port the whole Strategic Planning procedure. Since employees and knowledge 
represent important assets of a company we currently focus on these issues. An 
exemplary question within the planning process for human resource management 
is the following: 



¹ General information about SD and details on how to derive SFDs can be found in
Coyle 1996, Ford 1999, Forrester 1968, Goodman 1974, Maani a. Cavan 2000, and
Sterman 2000.

² Details covering issues like simulation and scenarios can be found in Gausemeier
et al. 1996 and Mertens 1982.








What are the effects if management increases the salary of the employees by 10%? 

Obviously, costs will increase. As an effect of assumed higher motivation,
performance may improve. Additionally, the higher salary will lead to a positive
reputation of the enterprise as a potential employer and attract more and
better-qualified employees while, in turn, fluctuation rates decrease, resulting
in overall declining recruitment costs. In this way, the positive effects might
outweigh the negative effects. The models will show whether this scenario holds
and, if so, to what extent and under which assumptions.

Presently, we focus on a specific topic within the field of Human Resource
Management - Knowledge Management. Software firms strongly depend on manpower and
knowledge - as a resource - linked to the employees. Although few investments have
to be made in machines within this industry - compared to e.g. printing companies
- high costs (e.g. providing loans at attractive conditions to employees) have to
be covered in order to be 'attractive' for the 'resource man'. So the strategic
task and major point of interest is 'ensuring knowledge' by qualifying staff. The
hiring process itself and the transition of newly hired employees to experienced
workers have already been covered in several models (see Maani a. Cavan 2000;
Sterman 2000).



3.1 Key Issues of Knowledge Management 

In our current model we look at a department within a software company. Besides
hiring and attrition we focus on the qualification process. A specific situation:

A department of about 100 software engineers is currently responsible
for a product that will be replaced. The new platform will be programmed in
a different language. Management wants to know how to do the training.

To answer questions that arise from this situation, we start with the description
of a classification that distinguishes three different types of knowledge:
Programming, Business Administration, and Social Skills.³



3.2 A Scale for 'Knowledge'

Referring to the availability of knowledge, we developed a classification that
uses five expressions: Null, Poor, Average, Good, and Expert. An employee holds a
certain amount of 'knowledge units (KUs)' that lead to a position within the
qualification groups. A change from one group into another can be achieved by
various steps. It is possible to gain KUs as well as to lose them within a certain
category. Gaining KUs is possible in two ways: Experience and Training. The loss
of KUs, on the other hand, is predominantly an effect of time passing by. We
distinguish the two alternatives Forgetting and Obsolete Knowledge.



³ For details about assumptions and classifications for the case see Bradl 2002.








In order to be able to remove people from one group and add them to another, we
deplete the groups according to a distribution key that is initially set and
adapted whenever people gain (or lose) qualification.
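
The gain and loss of KUs described in this section can be read as a single stock with two inflows (Experience, Training) and two outflows (Forgetting, Obsolete Knowledge). The following Python sketch is an illustration under stated assumptions, not the Powersim model used in the project: the thresholds between the five levels, the monthly rates, and the function names are all hypothetical.

```python
LEVELS = ["Null", "Poor", "Average", "Good", "Expert"]


def classify(ku, step=25.0):
    """Map knowledge units to one of the five levels (thresholds are assumed)."""
    index = min(int(ku // step), len(LEVELS) - 1)
    return LEVELS[max(index, 0)]


def simulate_ku(ku0, months, experience, training, forgetting_rate, obsolescence_rate):
    """Euler integration of one stock (KUs) with two inflows and two outflows."""
    ku = ku0
    path = [ku]
    for _ in range(months):
        inflow = experience + training                       # KUs gained per month
        outflow = (forgetting_rate + obsolescence_rate) * ku  # KUs lost per month
        ku = max(ku + inflow - outflow, 0.0)
        path.append(ku)
    return path


# Hypothetical employee: starts with 20 KUs and is trained for twelve months
path = simulate_ku(ku0=20.0, months=12, experience=2.0, training=6.0,
                   forgetting_rate=0.02, obsolescence_rate=0.03)
levels = [classify(ku) for ku in path]
```

Because the outflows are proportional to the stock while the inflows are constant, the stock approaches a ceiling at which gains and losses balance. In this sketch, training alone cannot raise an employee beyond the level determined by the assumed rates.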



3.3 Costs for 'Knowledge'

As mentioned earlier, Corporate Planning implies financial planning, which is
strongly related to human resource management. In our model we consider that
laying off people is costly in several respects. Compensation payments have to be
calculated as well as additional costs for hiring. Additionally, so-called
'rookies' have to be integrated and must learn the rules of the company - which
takes time - and are usually less efficient during the induction period. On the
other hand, salaries are lower for new members of the staff (for non-specialists).
Furthermore, training costs are lower for this group since it is assumed that the
new employees have the right qualification. Talking about training and costs leads
to decisions about the type of training: on the job or full-time; internal or
external; team by team or only a certain percentage of the team at once? Revenue
that might be lost due to reduced manpower while employees are in training is an
important factor when talking about training costs.



3.4 Used Software and Data Support of the Model 

At our institute we use Powersim as the environment for SD modelling. The
expressions for knowledge are implemented with the array functionality of this
software. We design personalized 'portals' to allow simulation by different
members of the management and to ensure that accessibility supports the use of the
existing model. The parameters for dependencies within the model and historical
data are supplied via external data sources such as Excel or the BW. Besides
ensuring that simulations run with real data, results are written back into the BW
for further calculations, for example within the planning area of the company's
ERP system.



3.5 Scenario 

Within one particular simulation run, emphasis was put on the question of which
training approach should be preferred - full-time (absence from daily work) or on
the job. Assumptions were made about the correlation between knowledge (as an
effect of training) and efficiency as well as the dependency between wages and
efficiency. The scenarios showed that the chosen company will gain higher profits
when training is performed on the job. Untrained employees clearly do not perform
as well as others, but they still cover workload and ensure that the company is
able to complete orders in time.








4 Conclusions and Outlook 

As mentioned above, dependencies in complex systems cannot be recognized
intuitively. Hence, a method must be used to identify them. System Dynamics does
this - and more. Even the simulation of only three types of scenarios - best,
worst, and most realistic case - for one variable will help to make plans in
advance. Thus managers will know what to do if the anticipated scenario arises in
reality. The "low" software and hardware requirements nowadays allow the use of SD
almost everywhere and should help to bring SD into management.



References 

Bradl P (2002) Strategic Enterprise Planning with System Dynamics. In: Proceedings of the 
2nd International Conference on Systems Thinking in Management. John Mason Print- 
ers, Skipton 

Coyle RG (1996) System Dynamics Modelling - A Practical Approach, Chapman & Hall, 
London 

Ford A (1999) Modelling the Environment, Island Press, Washington, D.C. et al. 

Forrester JW (1968) Principles of Systems, Wright-Allen Press, Cambridge MA

Gausemeier J, Fink A, Schlake O (1996) Szenario-Management, Hanser, München

Gomez P, Probst G (1999) Die Praxis des ganzheitlichen Problemlösens, 3rd edn, Bern et al.

Goodman MR (1974) Study Notes in System Dynamics, Wright-Allen Press, Cambridge
MA

Lyneis JM (1980) Corporate Planning and Policy Design, MIT Press, Cambridge,
Massachusetts

Maani KE, Cavan RY (2000) Systems Thinking and Modelling - Understanding Change
and Complexity, Pearson Education, New Zealand

Macharzina K (1999) Unternehmensführung: Das internationale Managementwissen, 3rd
edn, Gabler, Wiesbaden

Mertens P (1982) Simulation, Poeschel, Stuttgart

Moody's (2002) http://www.moodys.com (2002-08-29)

Perlitz M (2000) Internationales Management, 4th edn, Lucius u. Lucius, Stuttgart

Richardson GP (1986) Problems with Causal Loop Diagrams. In: System Dynamics
Review 2, pp. 158-170

Ross SA, Westerfield RW, Jaffe JF (1993) Corporate Finance, 3rd edn, Irwin, Homewood,
Illinois et al.

Sterman JD (2000) Business Dynamics - Systems Thinking and Modeling for a Complex
World, McGraw-Hill, Boston et al.

Voba-Börse (2001) http://voba-boersedirekt.de/ticker00/20000821ticker.pdf (2001-12-20)




Optimal Decision Rules in a Monetary Union 



Doris A. Behrens¹ and Reinhard Neck²

¹ Vienna University of Technology, Department of Operations Research and
Systems Theory, Argentinierstrasse 8/119, A-1040 Vienna, Austria
² University of Klagenfurt, Department of Economics,
Universitaetsstrasse 65-67, A-9020 Klagenfurt, Austria



Abstract. This paper develops a dynamic game model to study strategic interac- 
tions between the decision-makers in a monetary union. In such a union, govern- 
ments of the participating countries pursue national goals when deciding on fiscal 
policies, whereas the common central bank’s monetary policy aims at union-wide 
objective variables. Considering the example of a negative demand shock, we show 
how different solution concepts for the dynamic game between the common central 
bank and the national governments can be used as models of a conflict between 
national and supra-national institutions (noncooperative Nash equilibrium) and of 
coordinated policy-making (cooperative Pareto solutions). 



1 Introduction 

Policy-making for a national economy should be supported by a careful analysis
of its objectives, its constraints, and the possibilities of achieving an
outcome which is in some well-defined sense "better" than other available
alternatives. Operations research and economics have produced many theoretical
models in order to provide guidelines for arriving at "optimal" solutions for
economic decision problems. For fiscal and monetary policies in an
international setting, especially in a monetary union, where different
objectives of policy-makers are nearly inevitable, game theory is an adequate
tool to analyze and improve policy-making. Given the intertemporal nature
of macroeconomic policy problems, the toolkit of dynamic game theory (see
[1], for instance) recommends itself for obtaining insights and policy
recommendations for decision-makers (governments of member countries and the
common central bank) in a monetary union (see [2]).

Mathematical models of such a macroeconomic system that give a (largely)
realistic picture of the real-world decision problem of concern soon reach
the limits of analytical tractability. Therefore, in this paper we use the
OPTGAME 2.0 algorithm [3] to analyze a simple policy problem in a two-country
monetary union. This numerical algorithm is designed for determining solutions
of dynamic difference games with a finite planning horizon. In particular,
OPTGAME solves discrete-time LQ games, and approximates the solutions of
nonlinear-quadratic difference games by iteration. At present, the algorithm
calculates the open-loop and the feedback Nash equilibrium solution and the
cooperative Pareto-optimal solutions



Operations Research Proceedings 
© Springer- Verlag Heidelberg 2003 



© Springer- Verlag Berlin Heidelberg 2003 







for an arbitrary number of players; extensions to other solution concepts are 
being implemented. Here we will show that calculating different solution con- 
cepts for a dynamic game between the common central bank and national 
fiscal policy-makers can provide insights into possible conflicts and their so- 
lution in this context. 



2 The Model 

We consider a monetary union with two participating countries. Monetary
union means that national currencies (national central banks) have been
entirely replaced by a common currency (common central bank). Among other
things, this implies that the exchange rate has disappeared as an instrument of
adjustment. In the following description of the model, capital letters indicate
nominal values, while lower case letters correspond to real values. The two
countries are assumed to be of equivalent size in terms of gross domestic
product (GDP). The superscripts $d$ and $s$ denote demand and supply, respectively.
The supply side is mostly exogenous.

The demand side goods market is modeled by a short-run income-expenditure
equilibrium relation (IS curve), which is superimposed on an exogenous
natural growth path. For $t = 1, \ldots, T$, real output in country $i$ ($i = 1, 2$)
is given as the sum of the long-run equilibrium level of real output,
$\bar{y}_{it}$, and the short-term deviation therefrom, $\tilde{y}_{it}$, i.e.

$$y_{it} = \bar{y}_{it} + \tilde{y}_{it}, \tag{1}$$

where

$$\bar{y}_{it} = (1 + \theta)\,\bar{y}_{i(t-1)}, \quad \bar{y}_{i0} \text{ given}, \tag{2}$$

$$\tilde{y}_{it} = \delta_i \left( \frac{P_{jt}}{P_{it}} - 1 \right) - \gamma_i \left( r_{it} - \theta \right) + \rho_i \tilde{y}_{jt} + \eta_i f_{it} + z_{it}, \tag{3}$$

for $i \neq j$ ($i, j = 1, 2$). The variable $P_{it}$ ($i = 1, 2$) denotes country $i$'s
output price (its general price level), $r_{it}$ ($i = 1, 2$) represents country $i$'s
current real interest rate, and $f_{it}$ ($i = 1, 2$) denotes country $i$'s short-term
(deviation from a zero) real fiscal deficit. $f_{it}$ ($i = 1, 2$) in (3) is country
$i$'s fiscal policy instrument, i.e. its control variable. The natural real growth
rate, $\theta$, is assumed to be equal to the natural real rate of interest
(assuming dynamic efficiency, in accordance with neoclassical growth theory). The
parameters $\delta_i$, $\gamma_i$, $\rho_i$, $\eta_i$ ($i = 1, 2$) in (3) are assumed
to be positive. The variables $z_{1t}$ and $z_{2t}$ are not subject to control and
represent exogenous shocks on the demand side goods market.
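
The decomposition of real output into a natural growth path and a short-run deviation can be sketched numerically. The following Python fragment is only an illustration: the function names and all numbers are hypothetical, and the short-run deviations are supplied as data rather than computed from the IS relation, which would require solving for both countries simultaneously.

```python
def natural_output_path(y0_bar, theta, horizon):
    """Long-run equilibrium output: ybar_t = (1 + theta) * ybar_{t-1}, t = 1..T."""
    path = []
    level = y0_bar
    for _ in range(horizon):
        level = (1.0 + theta) * level
        path.append(level)
    return path


def real_output(y_bar_path, deviations):
    """Real output as equilibrium level plus short-run deviation."""
    return [bar + dev for bar, dev in zip(y_bar_path, deviations)]


# Hypothetical numbers, chosen only for illustration
y_bar = natural_output_path(y0_bar=100.0, theta=0.03, horizon=3)
y = real_output(y_bar, deviations=[-1.0, -0.5, 0.0])
```

A negative deviation in the first periods, as after a negative demand shock, places actual output below the natural path until the deviation has returned to zero.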

For $t = 1, \ldots, T$, the current real rate of interest for country $i$ ($i = 1, 2$) is
given by

$$r_{it} = R_{Et} - \pi_{it}, \tag{4}$$








where $R_{Et}$ denotes the common nominal rate of interest determined by the
common central bank, and $\pi_{it}$ ($i = 1, 2$) represents country $i$'s rate of
inflation. Note that the equilibrium level of the natural long-run interest rate,
$\bar{R}_{Et} = \bar{r}_{it} = \theta$ ($i = 1, 2$), is "inflation-free", i.e.
$\pi_{it} = 0$ ($i = 1, 2$). Output prices and inflation rates for $i = 1, 2$ and
$t = 1, \ldots, T$ are determined according to a demand-pull relation:

$$P_{it} = (1 + \pi_{it})\,P_{i(t-1)}, \quad P_{i0} \text{ given}, \tag{5}$$

$$\pi_{it} = \xi_i \tilde{y}_{it}, \tag{6}$$

where $\xi_1$ and $\xi_2$ are positive parameters. We also define average variables as

$$y_{Et} = \omega y_{1t} + (1 - \omega)\,y_{2t}, \quad \omega \in [0, 1], \tag{7}$$

$$\pi_{Et} = \omega \pi_{1t} + (1 - \omega)\,\pi_{2t}, \quad \omega \in [0, 1]. \tag{8}$$

Money demand in country $i$ ($i = 1, 2$) is the sum of long-run and short-run
money demand:

$$m^d_{it} = \bar{m}^d_{it} + \tilde{m}^d_{it}. \tag{9}$$

Short-run money demand is determined by a Keynesian money demand function
(LM curve):

$$\tilde{m}^d_{it} = \kappa_i \tilde{y}_{it} - \lambda_i \left( R_{Et} - \theta \right). \tag{10}$$

Here $\kappa_i$ and $\lambda_i$ ($i = 1, 2$) are positive parameters, $\theta$ is the
natural rate of interest, and $R_{Et}$ denotes the common nominal interest rate. Due
to the long-run equilibrium relations, $\tilde{y}_{it} = 0$, $\pi_{it} = 0$, and
$f_{it} = 0$ ($i = 1, 2$), long-run equilibrium money demand is given by

$$\bar{m}^d_{it} = \kappa_i \bar{y}_{it}. \tag{11}$$

Hence, there is no money illusion, and the Cambridge equation holds in the
long run. This leaves us with the following equilibrium relationship for the
long-run quantity of money (both demand and supply) in country $i$ ($i = 1, 2$):

$$\bar{M}^d_{it} = \bar{M}^s_{it} = P_{it}\,\bar{m}_{it} = P_{it}\,\kappa_i (1 + \theta)\,\bar{y}_{i(t-1)}. \tag{12}$$

This means that in each country, the price level will stay constant in the long
run if money supply $\bar{M}^s_{it}$ grows at the natural rate $\theta$. In a monetary
union, the sum of the countries' money demands has to be equal to the monetary
union's money supply.

In addition, we assume that the money market also always clears in the short
run, so that money supply equals the sum of short-run money
demands in countries 1 and 2,

= + (13) 

This leads to 



$$\tilde{M}_{Et} = \kappa_1 y_{1t} P_{1t} + \kappa_2 y_{2t} P_{2t} - (\lambda_1 P_{1t} + \lambda_2 P_{2t}) (R_{Et} - \theta). \tag{14}$$








Note that this implies that the short-run real rates of interest in the two
countries can diverge considerably both from each other and from the
long-run (natural) real rate of interest.

The government budget constraint is given as an equation for government 
debt of country i (i = 1, 2), 



Dit — (l 4- ReH-i)) + Pit — Dio given, (15) 

where the nominal fiscal deficit of country $i$ ($i = 1, 2$) is determined by the
identity

Pit ~ Pit fit = Pit fit- ( 16 ) 

$\tilde{B}_{Et}$ denotes the short-term deviation of high-powered money, $B_{Et}$, from
its long-run equilibrium level, $\bar{B}_{Et}$. The equilibrium stock of high-powered
money is assumed to grow geometrically at the natural rate $\theta$. Hence,

$$B_{Et} = \bar{B}_{Et} + \tilde{B}_{Et} = (1 + \theta)\, \bar{B}_{E(t-1)} + \tilde{B}_{Et}, \quad \bar{B}_{E0} \text{ given}. \tag{17}$$

$\tilde{B}_{Et}$ represents the control variable of the common central bank. The change
in high-powered money is distributed as seigniorage to the two countries
according to given positive parameters $\beta_1 \in [0, 1]$ and $\beta_2 := 1 - \beta_1$. Assuming
a constant money multiplier, $\psi$, the broad money supply of the monetary
union is given by

$$M_{Et} = \psi B_{Et}. \tag{18}$$
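The recursion behind equations (17) and (18) can be traced numerically. The following Python sketch is illustrative only and is not the authors' OPTGAME implementation: the function name `money_paths`, the starting stock `b_bar0 = 1.0` and the example control path are assumptions, and the defaults `theta = 0.03` and `psi = 2.0` are our reading of Table 1.

```python
def money_paths(b_bar0, b_tilde, theta=0.03, psi=2.0):
    """Trace high-powered money and broad money over t = 1..T.

    b_bar0  : equilibrium stock of high-powered money at t = 0 (assumed value)
    b_tilde : list of central-bank controls (deviations) for t = 1..T
    Returns a list of (equilibrium stock, total stock, broad money) tuples.
    """
    rows = []
    b_bar = b_bar0
    for deviation in b_tilde:
        b_bar = (1.0 + theta) * b_bar   # equilibrium stock grows at the natural rate
        b_total = b_bar + deviation     # equation (17)
        broad_money = psi * b_total     # equation (18), constant multiplier
        rows.append((b_bar, b_total, broad_money))
    return rows


# Hypothetical control path: no intervention in period 1, a small expansion in period 2.
paths = money_paths(b_bar0=1.0, b_tilde=[0.0, 0.01])
```

With a zero control path the total stock coincides with the equilibrium stock, which is the long-run case described in the text.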

Both national fiscal authorities are assumed to care about stabilization of 
inflation, output, debt, and fiscal deficits of their own countries. The common 
central bank is interested in the stabilization of inflation and output in the 
monetary union and in a low variability of its supply of high-powered money. 
Hence, the individual objective functions of the national governments and of 
the common central bank are given by 

1 T ✓ 1 \ t 

~ 2 ^ V ^ ^ (^^iX^it ^iy {Vit ~ Vit) "J" ^wBit 5 

1 T* ✓ 1 \ t 

= 2 ^ j ~ otEsBEtj , (20) 

where all weights are positive numbers $\in [0, 1]$. The joint objective function
for the calculation of the cooperative Pareto-optimal solutions is determined
by $J = \mu_1 J_1 + \mu_2 J_2 + \mu_E J_E$ ($\mu_1, \mu_2, \mu_E \geq 0$, $\mu_1 + \mu_2 + \mu_E = 1$).

The parameters of the model are specified numerically in the simplest 
possible way, leaving us with a symmetric monetary union (see Table 1). 
Lack of space precludes a detailed discussion of the parameter values chosen, 
the target values assumed for the objective variables of the players (which are 
basically the long-run equilibrium values of the respective variables), and the 






Table 1. Parameter values for i = 1, 2



Parameter T 9 


Si 7i Pi rji 




UJ Ki 


\i 


A 


x/j a’s 


Mi 


Value 20 0.03 0.5 0.5 0.5 1.0 


0.25 


0.5 1.0 


0.15 


0.5 


2.0 1.0 


0.33 




Table 2. Initial values for i = 1, 2








Variable 


ViO Vio PiO XiO 


Dio 


Reo 


Beq 


fiO 


Beq 




Value 


10 10 


0 


e 


1 


0 


0 





initial values of the state variables (see Table 2) of the dynamic game model. 
Detailed information on this is available from the authors on request. 

Equations (1)-(20) constitute a nonlinear dynamic game with a finite
planning horizon, where the objective functions are quadratic in the devia- 
tions of state and control variables from their respective desired values. 



3 Optimal Fiscal and Monetary Policies 

Several experiments were performed with the model, using different assump- 
tions about the paths of the exogenous non-controlled variables. For lack of 
space, we report only the results of one of them. This is a symmetric shock 
acting on both economies. In particular, we assume that autonomous real out- 
put (GDP) in both economies falls by 1.5% below the long-run equilibrium 
path for the first four periods and less for the next three periods: $z_{i0} = 0$,
$z_{i1} = z_{i2} = z_{i3} = z_{i4} = -0.015$, $z_{i5} = -0.01$, $z_{i6} = -0.005$, $z_{i7} = -0.0025$,
and $z_{it} = 0$ for $t \geq 8$ and $i = 1, 2$.
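The shock path described above can be written down compactly. This is a minimal Python sketch; the function name `shock_path` and the list representation are our own choices, and the horizon of 20 periods is taken from the value of T in Table 1.

```python
def shock_path(horizon=20):
    """Return the exogenous demand shock z_t for t = 0..horizon.

    The same path applies to both countries (symmetric shock).
    """
    nonzero = {
        1: -0.015, 2: -0.015, 3: -0.015, 4: -0.015,  # 1.5% below the equilibrium path
        5: -0.01, 6: -0.005, 7: -0.0025,             # shock fades out over three periods
    }
    return [nonzero.get(t, 0.0) for t in range(horizon + 1)]


z = shock_path()
```

Indexing the list by period keeps the correspondence with the subscripts in the text, so `z[5]` is the shock in period 5.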

Without policy intervention, this demand side shock leads to lower output 
and inflation (compared to the long-run equilibrium path) during the first 
five periods, but higher output and inflation afterwards (see Fig. 3). That is, 
the uncontrolled dynamic system adjusts in dampened oscillations, getting 
close to the long-run path after twenty periods. The maximum deviation of 
output from its equilibrium path is approximately 0.33% (in the first period). 
That is, even without policy intervention there are sufficiently strong negative 
feedbacks in the system to reduce the impact of the shock on output to about 
one fifth of the original shock at most in the case of a temporary symmetrical 
shock. This is mainly due to a strong reaction of the rate of interest, which 
falls to values near zero in the first four periods (but rises above the long-run 
value afterwards). Due to the symmetry of the economies and of the shock, 
the reactions of all variables are identical in both economies. 

When policy-makers are assumed to react to this shock according to
their preferences as expressed in their objective functions, several outcomes 
are possible, depending on the assumptions made about the respective other 








policy-makers. Here we consider two noncooperative equilibrium solutions
of the resulting dynamic game, the open-loop and the feedback Nash
equilibrium solution, and one cooperative solution, the Pareto-optimal collusive
solution (all players get the same weight $\mu_i = 1/3$, $i = 1, 2, E$). The feedback
Nash equilibrium solution is more interesting than the open-loop one because
the former is subgame perfect or Markov perfect, while the latter is only valid
if it is assumed that all policy-makers commit themselves unilaterally and
decide upon trajectories of their instrument variables once and for all at $t = 0$.

The time paths of the control variables - real fiscal deficit (for either
country) and additional high-powered money - under the three solution
concepts considered are shown in Figs. 1 and 2, respectively, those of the state
(and objective) variables - deviations from long-run equilibrium output and
government debt - in Figs. 3 and 4, respectively. Inflation rates show the
same qualitative pattern as outputs, price levels remain below the
equilibrium value of one for all periods, and the common nominal rate of interest
exhibits a behavior very similar to the uncontrolled case (falling to low values
in periods one to four, rising up to about 3% later on). All country-specific
variables show exactly the same time paths for both countries. More detailed
results are available from the authors on request.




Fig. 1. Country i’s real fiscal deficit 



As can be seen from the graphs, both fiscal and monetary policies react
to the negative demand shock in an expansionary and hence counter-cyclical
way: both countries create fiscal deficits during the first five to six periods and
surpluses afterwards, and the central bank raises its supply of high-powered
money during the first six years and reduces it afterwards. This results in
less output loss and lower deflation than in the uncontrolled solution. What









Fig. 2. Additional high-powered money 




Fig. 3. Country i’s output-deviation from its long-run equilibrium level 



is remarkable is the small magnitude of the (absolute) values of the instru- 
ments involved: the highest value of the fiscal deficit created is one tenth of 
one percentage point of GDP (in period one), for example, which would be 
nearly invisible in terms of the Maastricht criteria if applied in the European 
Economic and Monetary Union. This is due to the strong self-stabilizing 
forces in the model used, acting especially through the interest rate channel, 
as noted already for the uncontrolled solution. As there is not much need for 
counter-cyclical action, it is not surprising that optimal (equilibrium) policies 
entail only cautious activities. 







[Figure legend: Nash FB, Nash OL, Pareto, uncontrolled]



Fig. 4. Country i’s debt 



Comparing the noncooperative equilibrium solutions and the cooperative 
solution yields another interesting observation. All show qualitatively the 
same behavior, and the two noncooperative Nash equilibrium solutions are 
very close together in terms of all control and state variables. The collusive 
solution, although not too distant from the other two, exhibits more active 
policy-making (higher fiscal deficits and money creation in the first periods). 
This different policy-mix does not change the path of the rate of interest
(higher deficits increase, higher money supply decreases the interest rate,
ceteris paribus), but does so for the public debt trajectory: in the
noncooperative solutions, government debt is increased, whereas in the cooperative solution
it is reduced, reflecting the relatively higher increase of fiscal deficits as
compared to monetary supply increases in the noncooperative solutions. It
remains to be seen whether these policy patterns remain under alternative 
assumptions about the economic model or the shock. 



4 Concluding Remarks 

Applying dynamic game theory and the OPTGAME 2.0 algorithm to a simple 
macroeconomic model of fiscal and monetary policies in a monetary union, 
we obtained several insights into the design of economic policies facing a 
symmetric negative demand shock. In particular, optimal policies of both 
the governments and the common central bank are counter-cyclical but not 
very active, at least for the model under consideration. The outcomes of the 
different solution concepts of dynamic game theory are rather close to each 
other. In particular, a periodic update of information and related reduction 
of commitment (a change from an open-loop to a feedback Nash equilibrium 









solution) does not cause benefits or costs to any decision-maker.
Cooperative economic policies (both fiscal and monetary ones) are more active
or “aggressive” than noncooperative ones, resulting in a somewhat different 
policy-mix with higher stabilization effects. Further research will have to 
show how sensitive these results are with respect to the assumptions about 
the model and the shock. 

Acknowledgement 

This research was financially supported by the Austrian Science Foundation 
(project no. P12745-OEK). Any opinions, findings, conclusions or recommen- 
dations expressed in this paper are those of the authors and do not necessarily 
reflect the views of the Austrian Science Foundation. 



References 

1. Başar T., Olsder G. J. (1999) Dynamic Noncooperative Game Theory, 2nd edn.
SIAM, Philadelphia 

2. Petit M. L. (1990) Control Theory and Dynamic Games in Economic Policy 
Analysis. Cambridge University Press, Cambridge 

3. Behrens D. A., Neck R. (2002) Approximating Equilibrium Solutions of Multi- 
Player Difference Games Using the OPTGAME Algorithm. Working paper, Uni- 
versity of Klagenfurt 





Impact of Feedback Loop on Group 
Decision Process when Applying System 
Dynamics Simulators 



Andrej Skraba and Miroljub Kljajic 

University of Maribor, Faculty of Organizational Sciences, Kidriceva cesta 55a, 
SI-4000 Kranj, Slovenia, e-mail: {andrej.skraba, miroljub.kljajic}@fov.uni-mb.si



Abstract. The influence of feedback information on a group decision process supported
by system dynamics simulators is determined in the present study.
An experiment with decision groups was conducted under three experimental
conditions: $a_1$) determination of a business strategy without application of a formal model,
$a_2$) determination of the strategy with application of a formal system dynamics model,
and $a_3$) determination of the strategy with additional application of group
feedback information. The hypothesis that model application and group feedback
information positively influence the convergence of the decision process and yield
higher values of the criteria function was confirmed. The described group decision process
was subsequently analyzed in order to determine its frequency and deviation.



1 Introduction and Methodology 

There are several new approaches that expand System Dynamics (SD)
methodology regarding the exploration and development of SD models, for example
Vennix's group model building [1] or the application of AI [2]. Methods of
scenario creation and selection applying multicriteria decision functions were
developed in our previous research [3]. The present paper continues
research conducted in the field of group exploration of SD models [4].
Experiments under different conditions were conducted in order to analyze
the influence of information feedback and of different methods of work on the
decision process. The simulation system used in the experiments can be described
by the sets M, J and L, where M is the model, J is the set of criteria, and
L is the set of limitations. The main task of efficient management, which
participants should perform, is to determine the optimal control based on the
known M, J and L. Fig. 1 shows the model of production as a black box
with input parameters $u_1$, $u_2$, $u_3$ and $u_4$ (where $u_1$ is Product Price, $u_2$ is
Salary, $u_3$ is Marketing Costs and $u_4$ is Desired Inventory) and three
different experimental conditions $a_1$, $a_2$ and $a_3$. The model can be described by the
state equation:

$$x(k + 1) = f(x(k), u(k), a_i), \tag{1}$$

where $x(k)$ represents the state of the system, $u(k)$ the control vector or
alternative strategy, and $a_i$ the experimental condition.




[Figure: block diagram with inputs $u_1$, $u_2$, $u_3$, $u_4$ and output $J$]



Fig. 1. Model with Input Parameters at Different Experimental Conditions 



The model is controlled by a user-friendly interface in the form of a
business simulator. Participants had the task of promoting a product on the market
whose life cycle is one year. The criteria function was stated as:



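$$\max_{u \in U} J = \frac{d_0 + \sum_{i=0}^{k} d(t_i)}{c}\, w_1 + \frac{p_0 + \sum_{i=0}^{k} p(t_i)}{o_0 + \sum_{i=0}^{k} o(t_i)}\, w_2 + \frac{s_0 + \sum_{i=0}^{k} s(t_i)}{p_0 + \sum_{i=0}^{k} p(t_i)}\, w_3 + \frac{v_0 + \sum_{i=0}^{k} v(t_i)}{p_0 + \sum_{i=0}^{k} p(t_i)}\, w_4 \tag{2}$$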

where $d_0$ is the initial value of Income, $d(t_i)$ is the Income function with
$d(t_i) = p(t_i) - o(t_i)$, where $p(t_i)$ is the Revenue function and $o(t_i)$ is the Expenses
function, $t_k$ is the final time of observation, $c$ is the Capital, $w_1$, $w_2$, $w_3$ and
$w_4$ are the weight values, $p_0$ is the initial value of Revenues, $o_0$ are the
initial Expenses, $s_0$ are the initial workforce expenses, $s(t_i)$ is the workforce
expense function, $v_0$ are the initial inventory costs, and $v(t_i)$ is the inventory
costs function. The weights sum to one, $\sum_i w_i = 1$. The actual values of the weights
were set as constant factors. The criterion descriptions and weight
values are shown in Table 1.



Table 1. Criteria $J_i$ and Weights $w_i$

Criterion   Description                     Weight   Value of Weight
$J_1$       Capital Return Ratio            $w_1$    0.5
$J_2$       Overall Effectiveness Ratio     $w_2$    0.35
$J_3$       Workforce Effectiveness Ratio   $w_3$    0.10
$J_4$       Inventory/Income Ratio          $w_4$    0.05



The goal for the participants was to maximize the criteria function stated
in Eq. (2). The maximum value of the criteria function was determined by
several optimization methods as $J = 1.5$.
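The weighted aggregation of the four criteria in Table 1 can be sketched as follows. This Python fragment is an illustration only: it assumes that the four ratios have already been computed from a simulation run, and the function name `criteria_value` and the example ratio values are hypothetical, not taken from the simulator.

```python
# Weights from Table 1; they sum to one.
WEIGHTS = {"J1": 0.5, "J2": 0.35, "J3": 0.10, "J4": 0.05}


def criteria_value(ratios, weights=WEIGHTS):
    """Weighted sum of the criterion ratios J1..J4.

    ratios : dict with keys "J1".."J4" holding the ratio values of one strategy
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * ratios[name] for name in weights)


# Hypothetical ratio values for one simulated strategy.
example = criteria_value({"J1": 2.0, "J2": 1.0, "J3": 1.0, "J4": 1.0})
```

Because the capital return ratio carries half of the total weight, strategies that differ mainly in that ratio will differ most in the aggregated value.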








1.1 Decision Task and Hypotheses 



The decision task addressed by applying the SD simulator was stated
as follows: Find the appropriate values for Product Price, Salary, Marketing
Costs and Desired Inventory Level in order to achieve the maximum possible
value of the criteria function in Eq. (2). The time for conducting the experiment
with decision groups was 1/2 hour for all three conditions, which can be
described as follows:

Experimental condition $a_1$ assumes the individual assessment of the
decision-maker in determining the model parameter values $u_1, u_2, u_3, u_4$
through maximization of the criteria function in Eq. (2). The decision is made
without a formal model.

Participants under experimental condition $a_2$ search for the parameter
values $u_1, u_2, u_3, u_4$ with the aid of the SD model through maximization of the
criteria function. Interaction between subjects is not supported; therefore
group feedback is not applied.

Experimental condition $a_3$ assumes application of the SD model by
participants with group feedback information. The time for conducting the experiment
under this condition was divided into four time intervals, (8 + 8 + 8 + 6) min.
Each participant submitted the best-achieved set of parameter values $u_1, u_2, u_3, u_4$
to the network server, where the information about the
best parameter values is updated after each time interval. The system provides feedback
information to all participants regarding the individual best-achieved strategies of all
other participants in the experiment. Participants get feedback on the defined
strategies of all participants in the group, $S_i = \{u_{1i}, u_{2i}, u_{3i}, u_{4i}\}$; $i = 1, 2, \ldots, n$,
as well as the aggregated values in the form of mean values of parameters.
The mean values of the parameters are expressed as follows:




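$$\bar{S} = \left\{ \frac{\sum_{i=1}^{n} u_{1i}}{n}, \frac{\sum_{i=1}^{n} u_{2i}}{n}, \frac{\sum_{i=1}^{n} u_{3i}}{n}, \frac{\sum_{i=1}^{n} u_{4i}}{n} \right\}, \tag{3}$$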



where $\bar{S}$ is the set of mean parameter values and $n$ is the number of
participants. Such a functional mapping could take many other forms, not only
the arithmetic mean [5].
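The aggregation of individual strategies into mean feedback values can be sketched in a few lines. This Python example is illustrative: the function name `mean_strategy` and the numeric strategies are hypothetical, and each strategy is assumed to be a tuple ordered as (Product Price, Salary, Marketing Costs, Desired Inventory).

```python
def mean_strategy(strategies):
    """Return the component-wise arithmetic mean of the submitted strategies.

    strategies : list of 4-tuples (u1, u2, u3, u4), one per participant
    """
    n = len(strategies)
    if n == 0:
        raise ValueError("at least one strategy is required")
    return tuple(sum(s[j] for s in strategies) / n for j in range(4))


# Two hypothetical best-achieved strategies submitted after a time interval.
feedback = mean_strategy([(10.0, 100.0, 50.0, 200.0), (20.0, 200.0, 150.0, 400.0)])
```

Replacing the arithmetic mean by another aggregation, for example the median, would only require changing the expression inside the generator.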

The following statistical hypotheses about the different experimental
conditions were stated: a) Hypothesis 1: There is a significant difference between the results
of the decision process conducted under experimental condition $a_1$ (individual
assessment of the decision-maker in determining the business strategy with
no formal model) and condition $a_2$ (individual application of the SD model in
determining the business strategy). b) Hypothesis 2: There is a significant
difference between the results of the decision process conducted under
experimental conditions $a_2$ and $a_3$ (application of the SD model with group
feedback information).








1.2 Experimental Procedure 

One hundred and forty-seven senior graduate students (68 females and 79
males) from the University of Maribor participated in the research in order
to satisfy the requirements of their regular study program. Subjects were
introduced to the experimental problem of determining a business strategy
according to the stated criteria function. The introduction lasted 10
minutes. The structure of the considered system was presented and the main
parameters in the model were explained. The evaluation criteria for the
business strategies were also considered. Working with the simulator was
explained for experimental conditions $a_2$ and $a_3$. After the introduction every
participant formed a strategy according to the stated problem and passed
the results to the network server.

2 Results 

The results of the decision process conducted under experimental conditions
$a_1$ ($N_1 = 52$), $a_2$ ($N_2 = 55$), and $a_3$ ($N_3 = 40$) are shown in Fig. 2. The values
of the criteria function $J$ in Fig. 2 are ordered from highest to lowest (Y-axis). The
X-axis shows the relative number of participating subjects. Results gathered
under experimental condition $a_1$ are the lowest; their average is $M = 0.446$.
Average results gathered under experimental condition $a_2$ are higher than
the results gathered under experimental condition $a_1$ ($M = 1.076$), while the
standard deviation is smaller ($\sigma = 0.317$ for $a_2$ and $\sigma = 0.498$ for $a_1$). The
highest values of the criteria function were gathered under experimental condition
$a_3$ ($M = 1.386$), while the deviation was the smallest ($\sigma = 0.073$).

The Mann-Whitney nonparametric U-test was applied for the comparison of
experimental conditions $a_1$, $a_2$ and $a_3$. The calculated values of the U-test for the two
stated hypotheses were $U(N_1 = 52, N_2 = 55) = 410$, $p < 0.01$ for H1 and
$U(N_2 = 55, N_3 = 40) = 370$, $p < 0.01$ for H2; therefore, both stated hypotheses
are accepted. According to the tested hypotheses, it can be concluded
that significant differences exist between the results of the decision process
conducted under different experimental conditions. The main emphasis is on
the difference between experimental conditions $a_2$ and $a_3$, where SD models
are used without and with group informational feedback. These two groups
have the same technical means to address the stated decision problem.
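For readers unfamiliar with the statistic, the Mann-Whitney U value for two independent samples can be computed by counting pairwise comparisons. The Python sketch below is a generic illustration with made-up sample values; it is not the authors' analysis script, and it assumes the common convention of reporting the smaller of the two U values.

```python
def mann_whitney_u(x, y):
    """Return the Mann-Whitney U statistic (smaller of the two U values).

    Each pair (xi, yj) contributes 1 if xi > yj, 0.5 if tied, and 0 otherwise.
    """
    u_x = sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )
    u_y = len(x) * len(y) - u_x  # the two U values always sum to n_x * n_y
    return min(u_x, u_y)


# Hypothetical criteria-function values for two small groups.
u_value = mann_whitney_u([1.0, 3.0, 5.0], [2.0, 4.0, 6.0])
```

A small U relative to the product of the sample sizes indicates that one group's values lie mostly below the other's; the p-value then follows from tables or a normal approximation, which this sketch does not compute.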

In order to analyze the frequency and deviation of the group decision process, a
special network tool was developed which enabled us to sample the whole process
with millisecond accuracy. Every action of an individual participant was recorded in the
server database, which also enabled the mediation of feedback information.
The post-experimental test was conducted with $N = 30$ subjects, who were
also senior graduate university students. The left part of Fig. 3 represents the
frequency of the group decision process. Two experimental conditions are
shown, $a_2$ and $a_3$. The frequency of the process under condition $a_3$ is lower due to
the fact that additional processing of feedback information was performed.








Fig. 2. Values of Criteria Function J for Different Experimental Conditions 



This is evident in the oscillation and the four peaks in performance,
which coincide with the experimental phases of condition $a_3$. It can therefore
be concluded that less information processing (i.e. fewer tests on the
simulator) took place under condition $a_3$, but that the feedback information contributed to the
right orientation of the decision group.



Fig. 3. Frequency (left) and deviation (right) of the strategy search process over time [s] for
conditions $a_2$ and $a_3$.




The right side of Fig. 3 shows the deviation of the process expressed as the
( j). This ratio shows that the deviation in the initial phase of the experiment is
high in both cases $a_2$ and $a_3$ but stabilizes from approx. 300 s onward.
The search space in both cases is thereafter similar, in case $a_3$ slightly more








perturbed on account of several stops at each experimental phase. In both 
cases the deviation gets smaller as the time progresses since the search space 
of optimal strategy is narrowly determined. 

One could expect that the cumulative frequency in the decision process under
experimental condition $a_3$ will be lower on account of the subjective information
processing, i.e. $\int f[a_3]\,dt < \int f[a_2]\,dt$. This determines the condition
of information utility $U$, which should be higher in the process where feedback
is considered: $U(I[a_3]) > U(I[a_2])$ if $J_2 = J_3$.

3 Discussion 

Statistical analysis indicates that group feedback information significantly
impacts the group decision process supported by SD models. The influence of
information feedback results in higher convergence of the decision process.
Feedback information is therefore the main component of an efficient decision
support system based on SD models. The results of experimental condition
$a_1$, where no formal model was applied, indicate the importance of the formal
model. The effect of increasing system efficiency under the three different
conditions can be explained by the introduction of additional information
into the system, which is the main component of learning and of system
control. The information introduced by the formal model as well as the introduction of group
feedback information contributed to higher criteria function values and
convergence. It would not be appropriate, however, to conclude that more feedback always leads to better
results. The subjective limits of information processing
must certainly be considered when designing feedback. The question that arose
at this stage is how to aggregate feedback information and mediate only the
information that is decisive for a given case. Mean values, as the simplest
example, were used in our case and were shown to be appropriate. Feedback
information applied in similar decision environments should meet the
conditions of utilization in order to enhance the group decision process
supported by SD simulators. The feedback information mediated to the
subjects was in our case the basis for a faster solution of the stated decision problem.
It should be: a) mediated to the group and b) properly aggregated in order
to prevent information overload of subjects and achieve goal-seeking behavior.

Acknowledgement 

This research was supported by the Ministry of Education, Science and Sport of the Republic 
of Slovenia (Program No. PP-0586-501 and Project No. Z5-3313). 



References 

1. Vennix, J. A. M. (1996) Group model building: facilitating team learning using
system dynamics. Wiley, Chichester

2. Hines, J., House, J. (2001) The source of poor policy. System Dynamics Review 
17, 3-32 

3. Kljajic, M., Bernik, I., Skraba, A. (2000) Simulation Approach to Decision As- 
sessment in Enterprises. Simulation 75, 199-210 

4. Skraba, A. (2000) Multicriteria Group Decision Making with System Dynamics 
Models. Doctoral Thesis. Kranj - SI, University of Maribor 

5. Zeleny, M. (1982) Multiple Criteria Decision Making. McGraw-Hill, New York





The Management Game SINTO-Market - 
Report on Some Recent Experiments 



Otwin Becker^1, Tanja Feit^2, Vera Hofer^2, Ulrike Leopold-Wildburger^2,
Susanne Lind-Braucher^2, Jörg Schütze^2, and Reinhard Selten^3

^1 Alfred Weber Institute, University of Heidelberg

^2 Department of Statistics and Operations Research, Karl-Franzens-University
Graz, tanja.feit@uni-graz.at

^3 Laboratory for Experimental Economics, Friedrich-Wilhelms-University Bonn



Abstract. The management game SINTO-Market, which was originally developed
and programmed by Otwin Becker and Reinhard Selten, was recently played several times at
the University of Kalmar (Sweden) and the University of Graz (Austria).
The decisions were made by groups of students, each group representing one of the 
three firms. The electronic management game SINTO-Market puts the players in a 
competitive situation in the branded food product sector. 

We will give a short description of the structure of the game. The game runs 
via internet and can be seen at the web-page of our department. 

It is interesting to see that much can be learned about human behavior in 
complex economic decision situations from experimental research with management 
games. Surprisingly the results show some differences between genders. 



1 Description of the Game SINTO-Market 

Within a special food product sector, managers have to find the most suc- 
cessful strategies for production, prices and advertising expenses for up to 10 
brands, as well as a suitable three-dimensional combination of brand compo- 
nents concerning their form, taste and quality. 

SINTO is a synthetic protein product which does not exist in reality, 
although it may be developed at some time in the future. Experience gained 
in one sector cannot simply be transferred one to one to other sectors. Before 
beginning the game, the players receive a short game description. 

The synthetic protein product SINTO was developed almost simultane- 
ously by three companies. Each parent company has founded a subsidiary, 
further on referred to as the firm, which is given the task of producing and 
selling this new foodstuff. At the beginning of the game, each firm should 
bring only one brand onto the market. Then, as demand for the product 
grows over time it is worth offering two or more brands which differ in taste 
and packaging design. Each firm must decide how many brands to bring onto 
the market. Brand characteristics, production, price and advertising expendi- 
ture have to be fixed for each brand. The sales quantities are then calculated 
from this information. All brands are in competition with each other so that







the sales of one brand are also dependent on the decisions made for all other 
brands. 

The goal of each firm is the maximization of final owned capital. 

At the beginning of the game all firms are in the same situation. Each firm 
has 10 registered brand names available. The choice of the brand name does 
not have any influence on the sales of the brand. Each firm must establish 
the following choices for each brand offered on the market: 

1. The combination of the three brand characteristics 

a) taste (bitter - mild), 

b) form, i.e. graininess (fine - rough),

c) quality according to the wrapping design (high - low). 

2. The production quantity. 

3. The amount of expenditure assigned for advertising. 

4. The price. 

5. The capital investment. 

The choice of form and taste has no influence on the manufacturing costs, 
whereas the choice of the wrapping quality does. The wrapping design has a 
similar effect on sales as advertising. 

The sales of a brand do not only depend on the decisions made concerning 
this brand but also on the decisions made for all the other brands. The 
demand for brands is not dependent on which firm produces the brand. The 
buyers only consider the brand names, not the manufacturer. 

The demand for a single brand is greater when there are fewer brands with
similar taste characteristics on the market. A brand sells more, the higher the
quality of its packaging design is in comparison to the other brands. Sales are 
larger, the lower the price is and the higher the prices of the other brands are. 
Sales of a brand can also be increased by spending more for advertising. The 
sales of a brand decrease if the advertising expenses for other brands increase. 
The competition between two brands is stronger, if their taste characteristics 
are similar. However, this is only the case for taste characteristics and not for 
price, packaging design and advertising. As regards these three variables, each 
brand competes with all the other brands, independent of their combination 
of taste characteristics. 
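The qualitative dependencies listed above can be mimicked by a simple attraction model. The following Python sketch is purely hypothetical: the actual demand function of SINTO-Market is not given in this report, and the functional form, the class `Brand`, the function names and all numbers below are our own assumptions, chosen only so that each stated monotonicity holds.

```python
import math
from dataclasses import dataclass


@dataclass
class Brand:
    taste: tuple        # hypothetical coordinates (bitter-mild, fine-rough) in [0, 1]
    quality: float      # packaging quality, > 0
    price: float        # > 0
    advertising: float  # >= 0


def crowding(i, brands):
    """Taste crowding: large when other brands lie close in taste space."""
    return sum(
        math.exp(-math.dist(brands[i].taste, other.taste))
        for j, other in enumerate(brands)
        if j != i
    )


def sales(brands, market_size=1000.0):
    """Split an assumed fixed market among brands in proportion to attraction."""
    attraction = [
        b.quality * (1.0 + b.advertising) / (b.price * (1.0 + crowding(i, brands)))
        for i, b in enumerate(brands)
    ]
    total = sum(attraction)
    return [market_size * a / total for a in attraction]


# Two hypothetical, otherwise identical brands with different taste positions.
base = [Brand((0.2, 0.2), 1.0, 800.0, 1.0), Brand((0.8, 0.8), 1.0, 800.0, 1.0)]
base_sales = sales(base)
```

In this sketch price, packaging quality and advertising affect all brands through the shared denominator, whereas taste similarity enters only through the crowding term, mirroring the distinction made in the text.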

The results of each period are given to each firm at the end of that period. 
This market overview is a print-out which gives information about the brands 
offered in that period together with their taste characteristics, and also about 
the following items: 

the profit and loss account, 

the turnover-based contribution cost analysis, 

the production-based unit cost analysis, 

the number of sales, 

the advertising expenditures, and 

the prices of all brands offered in a period. 








Each firm should aim to attain the highest possible net assets by the end
of the game. The amount of net assets equals the owner's capital (according
to the latest balance sheet) plus the subsidiary's share of the activated
"brand images" (that is, net assets = owner's capital plus still effective
advertisement). The latter is the monetary value that represents the effect
the advertising would have if the game were to be continued. Because of this, it is still worth
investing in advertising even in the last few periods of the game.



2 Results 

The SINTO-Market was played once with 9 students at the University of
Kalmar (Sweden). Further sessions of the electronic management
game SINTO were played with students of Graz-University and are documented
in Table 1. We will show some interesting results.



Table 1. Composition of students at Graz-University

Game   female   male   Firm 1   Firm 2   Firm 3
818    3        -      1        1        1
222    -        3      1        1        1
111    7        -      3        2        2
444    -        7      2        2        3
777    9        -      3        3        3
999    -        9      3        3        3
666    10       -      6        3        1
333    -        11     3        1        7
Sum    29       30









The success of each firm depends on the following strategies:

S1: A rapid increase of production capacity.

S2: An efficient protection of a good market position, combined with
the relative standing of the others.

S3: The allocation of an appropriate amount of advertising expenditures.

Here we show some results in more detail. Obviously the price
is responsible for the consumers' behavior and the success of each firm. We
obtained the following findings:

F0: Fundamental Finding:

Female participants differ significantly from male participants.

F1: Females set prices lower than males.








Fig. 1. Average price (vertical axis: 650 to 950; horizontal axis: period, 1 to 15)




The diagram in Figure 1 shows the average prices during the 15 periods. The
difference between the levels of male and female prices is highly
significant, as several tests confirm (e.g. the Friedman test at the 0.01
level).

F2: The investment expenses of male subjects are higher than those of female 
groups. 

Fig. 2. Average investments (vertical axis: 0 to 400.000; horizontal axis: period; series: female, male)












The diagram in Figure 2 shows the average investments during the 15
periods. From period 5 to period 12 the investment behaviour of male and
female groups differs significantly at the 0.01 level (e.g. Friedman test).
Since demand is growing in those periods, male groups profit from their
higher investments.



F3: The net assets of male groups are higher than those of the female groups. 




Fig. 3. Average net assets (horizontal axis: period; series: female, male)



The diagram in Figure 3 shows the average net assets during the 15 periods.
Whereas male and female groups have similar average net assets in the first
half of the game, their average net assets differ at a significance level of
0.01 in the second half of the experiment.

Statistically it can be shown that at the end of the game (after 15 periods)
the male groups had significantly higher average net assets than the female
groups (significance level 0.01), which means that male groups were
significantly more successful than female groups.

Our final findings can be summarized as follows: female participants in the
management game SINTO-Market are more cautious than their male competitors.
Therefore their net assets and their profits are lower than those of male
students.










References 

1. Becker O., Selten R. (1970) Experiences with the Management Game
SINTO-Market, 136-150. In: Sauermann, H. Contributions to Experimental
Economics. Volume two. J. C. B. Mohr (Paul Siebeck), Tübingen

2. Becker O., Selten R., Leopold U., Lind S. (2001) The Management Game
SINTO-Market revieved. Working Paper, Institute of Social Sciences and
Management, University of Kalmar, Sweden

3. Blinder A. S., Canetti E. R. D., Lebow D. E., Rudd J. B. (1998) Asking
About Prices. A New Approach to Understanding Price Stickiness. Russell Sage
Foundation, New York

4. Selten, R. (1967) Investitionsverhalten im Oligopolexperiment, 60-102. In:
Sauermann, H. Contributions to Experimental Economics. J. C. B. Mohr (Paul
Siebeck), Tübingen

5. Selten, R. (1967) Ein Oligopolexperiment mit Preisvariation und
Investition, 9-59. In: Sauermann, H. Contributions to Experimental Economics.
J. C. B. Mohr (Paul Siebeck), Tübingen

6. Selten, R. (1967) Die Strategiemethode zur Erforschung des eingeschränkt
rationalen Verhaltens im Rahmen eines Oligopolexperimentes, 136-168. In:
Sauermann, H. Contributions to Experimental Economics. J. C. B. Mohr (Paul
Siebeck), Tübingen





Bounds & Likelihood Procedure Revisited 



Becker, O., Leitner, J., Leopold-Wildburger, U., Schuetze, J. H.

BECKER, Otwin,

Alfred Weber Institut, Universität Heidelberg, Grabengasse, D-69117 Heidelberg,
o.becker.walldorf@t-online.de

LEITNER, Johannes; LEOPOLD-WILDBURGER, Ulrike;

SCHÜTZE, Jörg Hermann,

Institut für Statistik und OR, Universität Graz, Universitätsstrasse 15, A-8010
Graz, ulrike.leopold@uni-graz.at



Abstract. In an experiment subjects are asked to predict the next value of a
univariate time series on the basis of the past observations. The average
forecasts of the subjects can be well described by a surprisingly simple rule,
which is called the "Bounds and Likelihood procedure" (Becker/Leopold 1996). In
a new series of the experiment, conducted in 2002 in Graz with 72 participants,
the behaviour and score of the subjects are further analysed with the Selective
Attention Test (subset of D2, Brickenkamp 1962) and the Intelligence Test for
Reasoning (subset of PSB, Horn 1969).

Keywords: Experimental economics, forecasting, visual extrapolation, eye-balling, 
time series, concentration, intelligence, attention, gender. 

JEL-classification: E27, D81, D7, C53, C92 

1. Introduction

For a long time economists have shown increasing interest in the analysis of
expectation formation. Following the approach of I. Fisher (1896, 1930) and F.A.
Hayek (1948), many macroeconomic theories have been developed which are at
least partly based on variations in expectations.

There are two major approaches to forecasting: conjectural and structure-based
procedures. Visual extrapolation ('eye-balling') belongs to the conjectural
procedures. The Bounds and Likelihood (B&L) procedure (Becker/Leopold 1996) is a
heuristic that describes expectation formation with regard to average forecasts.
In an experiment, subjects are asked to predict the next value of a univariate
time series (see Figure 1). To make a forecast, the respondents, relying solely
on their visual processing of the information, have to extend the respective
curve on the chart beyond the given endpoint. We focus on cases with limited
information for the subjects. The only information available for the decisions
is the past observations. No additional information about the meaning of the
series or about influential factors was given to the participants in the
experiment. This reflects common cases from real economic life.







We analyse the individual behaviour in comparison to the expected group
behaviour according to B&L. The B&L average forecast prediction is therefore
used to assess and analyse individual forecasting behaviour.

We link the forecasting performance to certain measures of attention and fluid
intelligence with the help of the Selective Attention Test (Test d2, Brickenkamp
1962) and the Intelligence Test for Reasoning (subset of PSB, Horn 1969).

2. Description of experiments 

The subject's task is the forecasting of a time series $x_t$ over 42 periods. The
first 6 periods serve as the phase of familiarisation. The time series is a
realisation of the stochastic difference equation

$$x_t = x_{t-1} - \mathrm{INT}(0{,}5 \cdot x_{t-2}) + u_t$$

with the endogenous variable $x_t$ and the white noise $u_t$. The variable $u_t$ is
uniformly distributed in the interval [1,6]. Consequently the random variable's
realisations are integers. The B&L procedure is described in the appendix.
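
The difference equation above can be simulated in a few lines. The following is only a sketch under stated assumptions, not the generator used in the experiment: the starting values `x_init`, the seed, and the reading of INT as rounding down are hypothetical choices, and the noise is drawn as an integer from 1 to 6 in line with the remark that the realisations are integers.

```python
import math
import random


def simulate_series(n_periods=42, x_init=(7, 7), seed=0):
    """Simulate x_t = x_{t-1} - INT(0.5 * x_{t-2}) + u_t.

    Assumptions (not stated in the text): the two starting values,
    the seed, and INT interpreted as rounding down (floor).
    """
    rng = random.Random(seed)
    x = list(x_init)  # hypothetical values for the first two periods
    while len(x) < n_periods:
        u = rng.randint(1, 6)  # integer noise, uniform on {1, ..., 6}
        x.append(x[-1] - math.floor(0.5 * x[-2]) + u)
    return x


series = simulate_series()
print(series)
```

Since the noise has mean 3.5 and the deterministic part halves the value two periods back, a series generated this way fluctuates around a level of about 7.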

The subjects were given the following information: The time series can be 
interpreted arbitrarily, as share price, economic cycle or commodity price. 

The gestalt of the time series can be seen more clearly from period to period. It is 
the only relevant information for the forecasting process. The subjects’ forecasts 
do not have an influence on the progression of the time series. 

The subjects had a strong financial incentive to forecast the time series correctly. 
They were paid 60 Eurocents for an exact prediction. A deviation of one (two) 
unit(s) was remunerated with 40 (20) Eurocents. The payments correspond to a 
function of prediction-errors based on absolute deviations, that is cut off to zero at 
the value of 3. 
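
The payment rule just described can be written as a single expression. This is a minimal sketch; the function name `payoff_cents` is a hypothetical label introduced here for illustration.

```python
def payoff_cents(forecast, actual):
    """Payment in Eurocents for one forecast.

    60 for an exact hit, 40 and 20 for a deviation of one and two
    units, and nothing from a deviation of three units onwards.
    """
    return max(0, 60 - 20 * abs(forecast - actual))


for error in range(5):
    print(error, payoff_cents(10 + error, 10))
```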

Figure 1 shows the subjects' average forecasts, the B&L values, the B&L errors in
predicting the average opinion, and the time series $x_t$.

Figure 1: B&L and average opinion (series: subjects, B&L, B&L error, time series)










Altogether 72 students of the faculty for social sciences and economics at Graz 
University took part in the experiment. 35 of the participants are male and 37 
female. 

3. Results 

We do not take into account the first 6 periods since we regard these periods as a 
training time for the subjects. We use the remaining 36 periods to form four 
segments consisting of 9 periods. Figure 2 shows the small deviation error in 
regard to the B&L average forecast. 





Segments    1            2             3             4             All
Period      Per. 7-15    Per. 16-24    Per. 25-33    Per. 34-42    Per. 7-42
B&L sum     0,8          1,03          0,85          0,67          0,84

Figure 2: Average forecasts and deviation from B&L



We analyse the average error, distinguishing by gender. There are some
interesting results:

As shown in the boxplots of Figures 3a to 3d, the forecasts of the female and
male subjects differ in an interesting way. The female participants show a
higher deviation from B&L in the first segment, while the male participants show
a higher deviation in the last segment.





Figure 3a and 3b: Deviation from B&L in segment 1 and 2 (boxplots by gender)













Figure 3c and 3d: Deviation from B&L in segment 3 and 4 (boxplots by gender)

We further analyse the results with regard to the intelligence tests. However,
we cannot reject the null hypothesis that the results of the applied tests are
independent of the forecasting performance with regard to gender.

The turning points offer another point of investigation. Turning points are
one of the two main components of the B&L procedure; the other is the extent of
the difference, along the trend of the time series, between the given period and
the next period to be forecast. The analysis concentrates on the turning points
predicted by B&L and not on the turning points of the time series. There are 12
B&L turning points in the four segments covering the periods 7 to 42. The
subjects' scores lie between 3 and 10 forecasted B&L turning points. Figure 4
illustrates how many turning points were predicted correctly by the 72 subjects.




Figure 4: Correctly forecasted B&L turning points 

According to Figure 5, female participants show better scores than their male
counterparts. While females score about 8 turning points on average, males
achieve at least one less.











Figure 5: Correctly forecasted B&L turning points (by gender)



Analysing the results with a U-test, we cannot reject the null hypothesis that
the score is independent of gender. The fourth segment, however, yields a
significant difference between female and male participants (Z = -1,866,
P = 0,62, two-tailed).

With regard to the turning points, the B&L procedure shows excellent results in
predicting the behaviour of a subgroup. On average, 2/3 of the turning points
are correctly anticipated. We could not find any link between the forecasting
behaviour and the psychological tests applied.

Literature: 

Becker, O., Leopold-Wildburger, U. (1996) The bounds & likelihood-procedure -
A Simulation Study concerning the efficiency of visual forecasting techniques.
Central European Journal of Operations Research and Economics 4, Iss. 2/3,
223-229.

Brickenkamp, R. (1962) Aufmerksamkeits-Belastungs-Test (Test d2). Göttingen,
Hogrefe.

Fisher, I. (1896) Appreciation and Interest. New York.

Fisher, I. (1930) The Theory of Interest. New York.

Hayek, F.A. (1948) Individualism and Economic Order. University of Chicago
Press.

Horn, W. (1969) Prüfsystem für Schul- und Bildungsberatung PSB. Göttingen,
Hogrefe.

Selten, R. (2001) What is Bounded Rationality, in Gigerenzer & Selten (Eds.),
Bounded Rationality, MIT Press, 26-27.








Appendix 

The bounds and likelihood procedure (Becker & Leopold, 1996): 

With the B&L procedure we want to forecast the subjects' prediction of a time
series $x_t$ ($t = 1, 2, \ldots, 42$) for the following period $t+1$. The value of the
following period is $x_{t+1}$. The forecast value for the following period is
called $f_{t+1}$. Our forecasting technique is based on the gestalt
characteristics of the time series. For period $t+1$ we start at the actual value
$x_t$ and consider the prior values of the time series as follows:

(1)  $f_{t+1} = x_t + b_t \,(1 - 2 l_t)\, \mathrm{sign}(x_t - x_{t-1})$

The estimated values $b_t$ for the average variation within the time series are

(2)  $b_t = \dfrac{1}{t-1} \sum_{j=2}^{t} |x_j - x_{j-1}|$

and the turning point probabilities $l_t$ are

(3)  $l_t = \begin{cases} \dfrac{1 + \text{number of local maxima} \le x_t}{2 + \text{number of local maxima}} & \text{for } x_t > x_{t-1}, \\[2ex] \dfrac{1 + \text{number of local minima} \ge x_t}{2 + \text{number of local minima}} & \text{for } x_t < x_{t-1}. \end{cases}$

If $x_t > x_{t-1}$, which is the upswing case, $l_t$ is the probability that $x_t$ is an
upper turning point (peak). Three cases A to C illustrate this:

A) all prior peaks above $x_t$:  $l_t \to 0$,  $f_{t+1} = x_t + b_t$

B) as many prior peaks above as below $x_t$:  $l_t = 1/2$,  $f_{t+1} = x_t$

C) all prior peaks below $x_t$:  $l_t \to 1$,  $f_{t+1} = x_t - b_t$

The coefficient $b_t$ serves as a bound in both directions for the maximal
predicted variation; as stated in equation (2), $b_t$ is an estimator of the
average variation of the time series. The relative frequency of the turning
point forecasts is given by $l_t$ and defined as follows: if the time series
increases from period $t-1$ to period $t$, then $l_t$ is the relative frequency
with which a decrease in the base series is forecasted. If, on the other hand,
the time series decreases from $t-1$ to $t$, then $l_t$ is the relative frequency
for forecasting an increase of the base series.

Therefore the term $c_t = b_t (1 - 2 l_t)\, \mathrm{sign}(x_t - x_{t-1})$ that is added to
$x_t$ in equation (1) varies within the interval $[-b_t, b_t]$. In the extreme case
$l_t = 1/2$ there is total uncertainty whether the time series will rise or fall,
and then the naive prediction $f_{t+1} = x_t$ should be expected.
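
The procedure of the appendix can be sketched in code as follows. This is an illustration under assumptions rather than the authors' implementation: $b_t$ is taken as the mean absolute change between consecutive periods, local maxima and minima are interior points strictly above or below both neighbours, ties are counted with "less than or equal" and "greater than or equal", and a flat last step returns the naive forecast. The function name `bl_forecast` is hypothetical.

```python
def bl_forecast(xs):
    """One-step B&L forecast f_{t+1} from the observations x_1, ..., x_t."""
    if len(xs) < 2:
        raise ValueError("need at least two observations")
    x_t, x_prev = xs[-1], xs[-2]

    # Bound b_t: average absolute period-to-period variation (assumption).
    diffs = [abs(b - a) for a, b in zip(xs, xs[1:])]
    b_t = sum(diffs) / len(diffs)

    # Prior turning points: strict interior local extrema (assumption).
    interior = range(1, len(xs) - 1)
    maxima = [xs[j] for j in interior if xs[j - 1] < xs[j] > xs[j + 1]]
    minima = [xs[j] for j in interior if xs[j - 1] > xs[j] < xs[j + 1]]

    if x_t > x_prev:    # upswing: likelihood that x_t is a peak
        l_t = (1 + sum(m <= x_t for m in maxima)) / (2 + len(maxima))
        sign = 1
    elif x_t < x_prev:  # downswing: likelihood that x_t is a trough
        l_t = (1 + sum(m >= x_t for m in minima)) / (2 + len(minima))
        sign = -1
    else:               # no direction: naive forecast
        return float(x_t)

    return x_t + b_t * (1 - 2 * l_t) * sign


print(bl_forecast([5, 9, 4, 8, 6, 7]))
```

In the example series the last step is an upswing to 7 and both prior peaks (9 and 8) lie above it, so the forecast continues the upswing by a fraction of the bound $b_t$.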





On the Allocation of Excesses of Resources in 
Linear Production Problems 



F.R. Fernández^1, G. Fiestras^2, I. García-Jurado^3, and J. Puerto^4

^1 Universidad de Sevilla, Facultad de Matemáticas, Spain, fernande@us.es
^2 Universidad de Vigo, Fac. Ciencias Económicas y Empresariales, Spain.
^3 Universidad de Santiago, Facultad de Matemáticas, Spain, ignacio@zmat.usc.es
^4 Corresponding author, puerto@us.es



Abstract. In this paper we consider non-centralized linear production situations.
In one of those situations, each producer $i$ of a set $N$ has an optimal production
plan $x_0^i$ for a linear production problem given by
$\max\{c^i x^i : A^i x^i \le b^i,\ x^i \ge 0\}$. In order to maximize the benefits, the
producers decide to take advantage of their excesses of resources
$(b^1 - A^1 x_0^1, \ldots, b^n - A^n x_0^n)$. We study the games which describe this
situation when players cooperate and side payments are possible, when players do
not cooperate, and when players cooperate and side payments are not possible.



Key words. Linear production models, linear programming, Nash equilib- 
rium, core. 

1 Introduction 

Let us consider a set $N = \{1, \ldots, n\}$ of producers. For each agent $i$,
$i = 1, \ldots, n$, let us denote by $A^i \in \mathbb{R}^{m \times p_i}$, $c^i \in \mathbb{R}^{p_i}$ and
$b^i \in \mathbb{R}^m$ the technological matrix, the unit selling price vector and the
resource bundle, respectively. We assume that there exist linear production
technologies; the optimal individual production policy for agent $i$,
$i = 1, \ldots, n$, is obtained by solving the problem:

$$\begin{array}{lll} \max & c^i x^i & \\ \text{s.t.} & A^i x^i \le b^i, & \qquad (P^i) \\ & x^i \ge 0. & \end{array}$$

Let $x_0^i$ be an optimal solution of $(P^i)$ and $b_0^i$ the vector of resources
consumed to implement the optimal production plan $x_0^i$, i.e. $b_0^i = A^i x_0^i$.
For any subset $S \subseteq N$, let us denote $b(S) = \sum_{i \in S} b^i$ and
$b_0(S) = \sum_{i \in S} b_0^i$.

We assume that all the producers must maintain their production capa- 
bilities and then, they want to allocate their excesses of resources in order 
to improve benefits. We propose two ways of analyzing the allocation of the 
excesses. One of them assumes that producers act cooperatively. In this case, 
the agents within each coalition ensure their individual production plans and 
then they share their excesses in order to improve the summation of their 
benefits. The goal in this cooperative analysis is to identify some core
allocations that can be realized by assignments of excesses. In the second way,







producers behave non-cooperatively. This means that each agent indepen- 
dently requests a portion of the total excess of resources. The goal is to 
identify those claims leading to a rest point, and to establish the relationship 
between the payoff of some Nash equilibria of the non-cooperative case with 
the imputations of the cooperative situation. 

2 The cooperative analysis 

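Given the production situation of $n$ agents described above we assume that
producers may cooperate. In this context cooperation means that agents
within a coalition $S$ can only distribute among themselves their total
surplus in order to improve their joint benefits. In this way, we define the
following game $(N,v)$ in characteristic form, where $v(\emptyset) = 0$, and for
any $S \subseteq N$

$$\begin{array}{lll} v(S) = \max & \sum_{i \in S} c^i x^i & \\ \text{s.t.} & A^i x^i - \varepsilon^i \le b_0^i, \quad i \in S, & \\ & \sum_{i \in S} \varepsilon^i \le b(S) - b_0(S), & \qquad (P^S) \\ & x^i \ge 0,\ \varepsilon^i \ge 0, \quad i \in S. & \end{array}$$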

We note in passing that even in the case where $c^i = c$ and $A^i = A$ for all
$i = 1, \ldots, n$, the game above differs from the classical linear production game
$(N,w)$ introduced by Owen (1975) [2]. (See [1] for an example that proves
this claim.) In Owen's model, the production is centralized by one of the
producers and the benefits are later allocated to the producers. Recall that
in that game $w(S) = \max\{c z : A z \le b(S),\ z \ge 0\}$.

In general, if $c^i = c$, $A^i = A$, $i = 1, \ldots, n$, holds, it is clear that
$v(S) \le w(S)$ for all $S \subseteq N$. This relationship does not hold in general when
the technological matrices and price vectors are different.

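Example 1. Let us consider the production system given by $N = \{1,2\}$,
$c^1 = (2,6,3)$, $b^1 = (7,1)$, $c^2 = (2,4,6)$, $b^2 = (1,5)$ and a common
$2 \times 3$ technological matrix $A^1 = A^2 = A$ whose second and third columns are
$(2,1)^T$ and $(1,3)^T$, as the consumptions in the table below show.

The following table collects a pair of optimal solutions and their corresponding
optimal values, individual consumptions and excesses of resources.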





                               P^1        P^2
optimal solution (x_0^i)       (0,1,0)    (0,0,1)
optimal value (c^i x_0^i)      6          6
consumption (b_0^i)            (2,1)      (1,3)
surplus (b^i - b_0^i)          (5,0)      (0,2)



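In our approach, $v(1,2)$ is given by the optimal value of the problem

$$\begin{array}{ll} \max & \sum_{i=1}^{2} c^i x^i \\ \text{s.t.} & A x^i - \varepsilon^i \le b_0^i, \quad i = 1, 2, \\ & \varepsilon^1 + \varepsilon^2 \le (5,2), \\ & x^i \ge 0,\ \varepsilon^i \ge 0, \quad i = 1, 2. \end{array}$$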








Its optimal value is 25.2, and an optimal solution is given by
$(x^1, x^2, \varepsilon^1, \varepsilon^2) = ((0,3,0), (0,0.6,0.8), (4,2), (1,0))$. If we define
$w(1,2) = \max\{v^1(1,2), v^2(1,2)\}$, where
$v^1(1,2) = \max\{c^1 x : A x \le (8,6),\ x \ge 0\}$ and
$v^2(1,2) = \max\{c^2 x : A x \le (8,6),\ x \ge 0\}$, we get $v^1(1,2) = 24$ and
$v^2(1,2) = 19.2$. Then $w(1,2) = 24 < v(1,2)$.
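
The figures of Example 1 can be checked numerically. The sketch below only verifies that the stated solution is feasible and attains the value 25.2; it does not prove optimality. It works with the second and third products only, because the first product is never produced in the stated plans; the two matrix columns are read off from the consumptions in the table, which is an assumption about the matrix $A$.

```python
# Columns 2 and 3 of A, inferred from the consumptions (2,1) and (1,3).
A23 = [[2, 1],
       [1, 3]]
c = {1: (6, 3), 2: (4, 6)}       # prices of products 2 and 3
b0 = {1: (2, 1), 2: (1, 3)}      # resources consumed by the individual plans
total_excess = (5, 2)            # (5,0) + (0,2)
x = {1: (3, 0), 2: (0.6, 0.8)}   # stated plans, restricted to products 2 and 3
eps = {1: (4, 2), 2: (1, 0)}     # stated allocation of the excesses

TOL = 1e-9


def consumption(plan):
    """Resources needed to carry out a production plan."""
    return tuple(sum(a * q for a, q in zip(row, plan)) for row in A23)


# Each agent may use its own consumption b0 plus the excess it receives.
feasible = all(
    used <= own + extra + TOL
    for i in (1, 2)
    for used, own, extra in zip(consumption(x[i]), b0[i], eps[i])
) and all(eps[1][k] + eps[2][k] <= total_excess[k] + TOL for k in range(2))

value = sum(price * q for i in (1, 2) for price, q in zip(c[i], x[i]))
print(feasible, round(value, 6))
```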

In the game $(N,v)$ the imputation set is

$$I(v) = \Big\{ r \in \mathbb{R}^n : \sum_{i \in N} r_i = v(N),\ r_i \ge c^i x_0^i,\ \forall i \in N \Big\}.$$

In particular, given $(\varepsilon^1, \ldots, \varepsilon^n)$ and $(x^1, \ldots, x^n)$ an optimal solution of
$(P^N)$, it is clear that $(c^1 x^1, \ldots, c^n x^n)$ is an imputation of $(N,v)$.

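Let us denote by $(D^S)$ the dual problem of $(P^S)$ for any coalition $S \subseteq N$.
This is:

$$\begin{array}{lll} \min & \sum_{i \in S} y^i b_0^i + \gamma\,(b(S) - b_0(S)) & \\ \text{s.t.} & y^i A^i \ge c^i, \quad i \in S, & \qquad (D^S) \\ & -y^i + \gamma \ge 0, \quad i \in S, & \\ & y^i \ge 0, \quad i \in S, \quad \gamma \ge 0. & \end{array}$$

Let $(y_N, \gamma_N)$, with $y_N = (y_N^1, \ldots, y_N^n)$, be an optimal solution of Problem
$(D^N)$. This solution allows us to characterize a family of imputations of the
game $(N,v)$.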

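Proposition 1. Let $(\varepsilon^1, \ldots, \varepsilon^n)$ be an allocation of excesses satisfying
$\sum_{i=1}^{n} \varepsilon^i = b(N) - b_0(N)$, $\varepsilon^i \ge 0$, for any $i = 1, \ldots, n$. Then,

$$r = (y_N^1 b_0^1 + \gamma_N \varepsilon^1, \ldots, y_N^n b_0^n + \gamma_N \varepsilon^n) \in I(v).$$

Proof. Since $(y_N, \gamma_N)$ is an optimal solution of $(D^N)$, then

$$v(N) = \sum_{i=1}^{n} y_N^i b_0^i + \gamma_N (b(N) - b_0(N)) = \sum_{i=1}^{n} y_N^i b_0^i + \gamma_N \sum_{i=1}^{n} \varepsilon^i = \sum_{i=1}^{n} (y_N^i b_0^i + \gamma_N \varepsilon^i).$$

Moreover, $(y_N^i, \gamma_N)$ is a feasible solution of $D^i$, the dual problem of $P^i$, and
thus, by the weak duality theorem, $y_N^i b_0^i + \gamma_N \varepsilon^i \ge v(i)$. Since this argument
does not depend on the particular choice of $i$, $r$ is an imputation of the
game $(N,v)$. □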

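In any case, we can characterize those imputations $r = (r_1, \ldots, r_n)$ that
can be reached through allocations of excesses. Indeed, $r$ can be reached if
and only if Problem (P):

$$\begin{array}{lll} \max & \sum_{i=1}^{n} c^i x^i & \\ \text{s.t.} & A^i x^i - \varepsilon^i \le b_0^i, \quad i \in N, & \\ & \sum_{i=1}^{n} \varepsilon^i \le b(N) - b_0(N), & \qquad (P) \\ & c^i x^i \le r_i, \quad i = 1, \ldots, n, & \\ & x^i \ge 0,\ \varepsilon^i \ge 0, \quad i = 1, \ldots, n, & \end{array}$$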








has an optimal value of $v(N)$.

We note in passing that reachable imputations are rare because they must
satisfy the condition above. A special case of the allocation introduced in
Proposition 1 gives us a core element of the game $(N,v)$.

Proposition 2. The imputation
$r = (y_N^1 b_0^1 + \gamma_N (b^1 - b_0^1), \ldots, y_N^n b_0^n + \gamma_N (b^n - b_0^n))$,
where $(y_N, \gamma_N)$ is an optimal solution of $(D^N)$, belongs to the core of the
game $(N,v)$.

Proof. It is sufficient to prove that for any $S \subset N$,
$\sum_{i \in S} (y_N^i b_0^i + \gamma_N (b^i - b_0^i)) \ge v(S)$. Indeed, this inequality follows because
$((y_N^i)_{i \in S}, \gamma_N)$ is a feasible solution of $(D^S)$, the dual problem of $(P^S)$. □

Notice that the game $(N,v)$ is balanced. Since the class of balanced games
coincides with the class of linear production games, there exists a linear
production game that induces the game $(N,v)$. In this case the construction is
straightforward in terms of the elements of the production system
(technologies, prices and resources).



3 The non-cooperative analysis 



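In this section, we analyze our model from a non-cooperative point of view.
Assuming that the excesses of the different producers are available in a
common bundle and that the agents behave non-cooperatively, their interaction
can be modeled as the following n-person non-cooperative game in strategic
form. For each agent $i$, $i = 1, \ldots, n$, its set of pure strategies is given by
$\{\varepsilon^i \in \mathbb{R}^m : 0 \le \varepsilon^i \le b(N) - b_0(N)\}$, and given a profile of pure strategies
$(\varepsilon^1, \ldots, \varepsilon^n)$, the payoff function is:

$$K^i(\varepsilon^1, \ldots, \varepsilon^n) = \begin{cases} \max\{c^i x^i : A^i x^i \le b_0^i + \varepsilon^i,\ x^i \ge 0\} & \text{if } \sum_{i \in N} \varepsilon^i \le b(N) - b_0(N), \\ c^i x_0^i & \text{otherwise.} \end{cases}$$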
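
The rule behind this payoff function (claims are honoured only if together they fit into the common bundle) can be illustrated with a small sketch. Everything numerical here is hypothetical: each agent is assumed to make a single product, so that the inner linear program has the closed form $c \cdot \min_k r_k / a_k$, and the data of the two agents are invented for illustration. The fallback payoff $c^i x_0^i$ is computed as the optimum with $b_0^i$ alone, which coincides with it because $x_0^i$ consumes exactly $b_0^i$.

```python
def payoff(i, claims, agents, total_excess, tol=1e-9):
    """Payoff K^i of single-product agent i for a profile of claims.

    agents[i] = (a, c, b0): input coefficients, unit price, and the
    resources consumed by the agent's individual optimal plan.
    """
    a, c, b0 = agents[i]
    # Claims are granted only if, resource by resource, they fit the pool.
    granted = all(
        sum(claim[k] for claim in claims) <= total_excess[k] + tol
        for k in range(len(total_excess))
    )
    extra = claims[i] if granted else [0.0] * len(b0)
    # One product: max{c*x : a*x <= r, x >= 0} = c * min_k r_k / a_k.
    return c * min((b0[k] + extra[k]) / a[k] for k in range(len(b0)) if a[k] > 0)


# Hypothetical data: two agents, two resources, pooled excess (2, 4).
agents = [((1, 1), 3, (2, 2)), ((1, 2), 5, (1, 2))]
total_excess = (2, 4)

print(payoff(1, [(0, 0), (2, 4)], agents, total_excess))  # claims fit the pool
print(payoff(1, [(1, 0), (2, 4)], agents, total_excess))  # joint overclaim
```

In the second call agent 0 also asks for one unit of the first resource, the claims no longer fit, and every agent falls back to its individual optimum.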



In order to obtain rest points of this conflict situation we look for Nash 
equilibria of this game. 

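Theorem 1. Any profile $(\varepsilon^1, \ldots, \varepsilon^n)$ such that
$\sum_{i \in N} \varepsilon^i = b(N) - b_0(N)$, $\varepsilon^i \ge 0$, $i = 1, \ldots, n$, is a Nash equilibrium.

Proof. Let us take $(\varepsilon^1, \ldots, \varepsilon^n)$ with $\sum_{i \in N} \varepsilon^i = b(N) - b_0(N)$. If we assume
that agent $j$ deviates and chooses $\hat\varepsilon^j \ne \varepsilon^j$, two situations may occur:
1) $\sum_{i \ne j} \varepsilon^i + \hat\varepsilon^j \le b(N) - b_0(N)$ and not equal; or 2) there exists a
component, say $k$, such that $\sum_{i \ne j} \varepsilon^i_k + \hat\varepsilon^j_k > b(N)_k - b_0(N)_k$. In the latter
case, $K^j(\varepsilon^{-j}, \hat\varepsilon^j) = c^j x_0^j \le K^j(\varepsilon^1, \ldots, \varepsilon^n)$. In the former case, since any
feasible $x^j$ in the definition of $K^j(\varepsilon^{-j}, \hat\varepsilon^j)$ is feasible in
$K^j(\varepsilon^1, \ldots, \varepsilon^n)$, then $K^j(\varepsilon^{-j}, \hat\varepsilon^j) \le K^j(\varepsilon^1, \ldots, \varepsilon^n)$. Therefore,
$(\varepsilon^1, \ldots, \varepsilon^n)$ is a Nash equilibrium. □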








Remark 1. In general, null requests by all the producers are not a Nash
equilibrium. On the other hand, any profile in which some players ask for too
much is a Nash equilibrium. (Nevertheless, these Nash equilibria are not really
interesting.)

Another interesting property of this model is that the set of undominated
payoff equilibria is non-empty. Moreover, our next result characterizes the
Nash equilibrium profiles in pure strategies whose corresponding payoff vectors
are undominated.



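Theorem 2. For any payoff undominated equilibrium $(\varepsilon^1, \ldots, \varepsilon^n)$ there
exists $(x^1, \ldots, x^n)$ such that $(\varepsilon^1, \ldots, \varepsilon^n, x^1, \ldots, x^n)$ is a Pareto-solution of

$$\begin{array}{lll} \max & (c^1 x^1, \ldots, c^n x^n) & \\ \text{s.t.} & A^i x^i \le b_0^i + \varepsilon^i, \quad i \in N, & \qquad (UNP) \\ & \sum_{i \in N} \varepsilon^i \le b(N) - b_0(N), & \\ & x^i \ge 0,\ \varepsilon^i \ge 0, \quad \forall i \in N. & \end{array}$$

Conversely, for any Pareto-solution $(\varepsilon^1, \ldots, \varepsilon^n, x^1, \ldots, x^n)$ of Problem $(UNP)$,
$(\varepsilon^1, \ldots, \varepsilon^n)$ is an undominated payoff Nash equilibrium.

The proof is straightforward from the definition of undominated payoff Nash
equilibrium and Pareto-optimal solution of a vector-maximum problem.

The link between the non-cooperative and cooperative analyses is stated
below.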

Proposition 3. If $(x, \varepsilon)$ is an optimal solution of $(P^N)$ then $\varepsilon$ is an
undominated payoff Nash equilibrium of the non-cooperative game and
$K^i(\varepsilon) = c^i x^i$, $i = 1, \ldots, n$.

Proof. The proof follows because $(x, \varepsilon)$ is an optimal solution of the
scalarization of $(UNP)$ with the scalar weights $\lambda = (1, \ldots, 1)$. □



4 The non-transferable utility cooperative analysis 

In this section we consider the production situation from a cooperative point
of view, as in Section 2, but assuming that side payments are not possible.
Thus, an NTU game should be used to model this situation, more precisely,
an NTU market game (see, for instance, Owen (1995) [3] for details on market
games). We again assume that each producer has an optimal production plan
($x_0^i$ being the optimal production plan of agent $i$). So, the original excess of
resources of every player $i \in N$ is $b^i - b_0^i$. If a coalition $S$ forms, players can
redistribute their endowments in any way desired, i.e., they can obtain any
tuple $(\varepsilon^i)_{i \in S}$ such that: 1) $\sum_{i \in S} \varepsilon^i = b(S) - b_0(S)$, and 2) $\varepsilon^i \ge 0$, for all
$i \in S$. Any such tuple is called a feasible allocation for $S$.








Then, for every $S$, $V(S)$ is the set of all $z \in \mathbb{R}^n$ for which there exists
$(\varepsilon^i)_{i \in S}$, a feasible allocation for $S$, such that $z_i \le u_i(\varepsilon^i)$ for all
$i \in S$, where

$$u_i(\varepsilon^i) = \max\{c^i x^i : A^i x^i \le b_0^i + \varepsilon^i,\ x^i \ge 0\}.$$

It is immediate to check that, defined in this way, the functions $u_i$ are concave
(just note that $u_i(\varepsilon^i) = \min\{(b_0^i + \varepsilon^i) y^i : y^i A^i \ge c^i,\ y^i \ge 0\}$), continuous and
non-decreasing, so the core of $(N,V)$ is non-empty, a competitive equilibrium
exists and it provides a core allocation.

An interesting subset of $core(N,V)$ is the following set, which will be called
$subcore(N,V)$:

$$\{z \in PB(V(N)) : (z_i)_{i \in S} \ge (y_i)_{i \in S} \text{ for all } y \in V(S) \text{ and all } S \subset N\},$$

where $PB(V(N))$ denotes the Pareto boundary of $V(N)$. Next we provide a
characterization of the non-emptiness of $subcore(N,V)$. For any $z \in V(N)$,
we define the family of TU-games $\{(N, v_z^j),\ j \in N\}$ such that $v_z^j(N) = z_j$
and, for any non-empty $S \subset N$,

$$v_z^j(S) = \begin{cases} \max\{y_j : y \in V(S)\} & \text{if } j \in S, \\ 0 & \text{if } j \notin S. \end{cases}$$

Theorem 3. It holds that $subcore(N,V) \ne \emptyset$ if and only if there exists
$z \in V(N)$ such that all the TU-games $(N, v_z^j)$, $j \in N$, have a non-empty core.

Proof. Assume that the games $(N, v_z^j)$, $j \in N$, have a non-empty core.
Then, for every $j \in N$, there exists $z^j = (z_1^j, \ldots, z_n^j)$, a core element of
$(N, v_z^j)$. Hence, for any $S \subset N$ and any $j \in S$,

$$z_j = \sum_{i \in N} z_i^j \ge \sum_{i \in S} z_i^j \ge y_j$$

for all $y \in V(S)$. Thus, there must exist $\bar z \in subcore(N,V)$ with $\bar z \ge z$.

Conversely, take $z \in subcore(N,V)$ and define $z^j = (0, \ldots, z_j, \ldots, 0)$, with $z_j$
in position $j$. Clearly $z^j \in core(N, v_z^j)$.

References 

1. Fernández F.R., Fiestras G., García-Jurado I., Puerto J. (2002) Competition
and Cooperation in non-centralized linear production games. Prepublicaciones
de la Facultad de Matemáticas, July-01. Universidad de Sevilla

2. Owen G. (1975) The core of linear production games. Mathematical
Programming 9, 358-370.

3. Owen G. (1995) Game Theory. Academic Press.



Simulation of CO2 Certificate Trading and
Algorithmic Optimization of Investments



Silja Meyer-Nieberg and Stefan Pickl 

Department of Mathematics, University of Cologne, Center for Applied Computer 
Science, Weyertal 80, 50931 Cologne, Germany 



Abstract. The software TEMPI (Technology Emissions Means Process
Identification), developed at the Center for Applied Computer Science, provides
a modelling environment that makes it possible to compare and optimize financial
investments and their influence on CO2 reduction measures by means of a
time-discrete model.



1 Simulation of International CO2 Certificate
Trading and Optimization of Investments

TEMPI is based on the fundamental equations of the TEM model, which describe
this interaction mathematically. An essential point is that the TEM model
relies only on quantities that can be recorded empirically. For a detailed
description we refer to Pickl (1999) and Pickl (2002).

2 Environmental Licences - Certificate Trading

For more than 30 years, environmental licences have been regarded within
environmental economics as an alternative to existing regulatory procedures.
The Kyoto Protocol even explicitly recommends tradable environmental rights as
market-based instruments. Through their economic incentives they can make an
efficient contribution to environmental protection. For the development and
establishment of such an environmental licence system, Pickl (2002) regards the
following four phases as necessary:

1. Determination of a permissible total emission quantity

2. Distribution of the licences, e.g. by a grandfathering procedure

3. Provisions for a functioning trade

4. Development of control mechanisms

Furthermore, the reduction of CO2 emissions can be regarded as exemplary for
the establishment of such an environmental licence system. In this context,
the qualitative and quantitative assessment of different technology paths is of
central importance.







2.1 Leakage Effects and Portfolio Optimization

Since there is so far little project experience worldwide with this planned
certificate trading, one depends in particular on the simulation of action
scenarios. According to Oberthür (2000), it would moreover have to be clarified
how the emissions would have developed without the project (taking into account
so-called leakage effects) in order to guarantee success.

The simulation unit TEMPI now offers the possibility of comparing different
options by means of a time-discrete model. It can thereby prepare and accompany
the design process for worldwide CO2 certificate trading with respect to the
following points:

• Simulation of different initial allotments

• Variation of trading restrictions

• Analysis of discounting measures

Furthermore, this supports later rational decision making by the individual
actors in portfolio optimization for future power plant construction. Since the
current CO2 data show a large variance, in a pilot phase only the data
registered in the clearing house are considered, and they are used in normalized
form. In a second phase, the real data then available are to be analysed and
incorporated into the simulation model.



2.2 Currency Speculation Problems

The modelling follows existing models of currency speculation problems (DSP).
These are frequently simulated as maximum flow problems with flow multipliers
within combinatorial optimization. Special network simplex algorithms can be
used to solve them. Whereas DSPs are exclusively concerned with maximizing the
monetary value considered, in international certificate trading the
availability of admissible certificates must additionally be taken into account
in optimal investment planning. The use of investment funds and the accompanying
purchase and sale of certificates are influenced by rational decisions, which
are to be taken into account by game-theoretic methods. In particular, the core
is used as a possible allocation to describe joint cooperative action.



2.3 Algorithmic Determination of Core Elements

In the following, a natural procedure is described that makes it possible to
determine core elements via a suitable dualization. The algorithm lends itself
to implementation within experimental economics. We begin as follows:

• Order the subsets of the player set $N$ that contain at least 2 elements in
the following way: form the sequence $K_1, \ldots, K_{2^n - n - 1}$ such that
$|K_i| \le |K_{i+1}|$ holds for $i = 1, \ldots, 2^n - n - 2$.
This implies $K_{2^n - n - 1} = N$.

• Now define the $(2^n - n - 1) \times n$ matrix $A = (a_{ik})$ by

$$a_{ik} = \begin{cases} 0, & \text{if } k \notin K_i \\ 1, & \text{if } k \in K_i \end{cases}$$

• and a $(2^n - n - 1)$-vector $b = (b_i)$ by

$b_i = v(K_i)$ for $i = 1, \ldots, 2^n - n - 1$

It is known that a vector $x \in \mathbb{R}^n$ is an element of the core if and only if
the following conditions are satisfied:

• P1: $\sum_{k=1}^{n} a_{ik} x_k \ge b_i$ for $i = 1, \ldots, 2^n - n - 2$

• P2: $\sum_{k=1}^{n} x_k = b_{2^n - n - 1} = v(N)$

• P3: $x_k \ge 0$ for $k = 1, \ldots, n$
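
The construction of the coalition matrix and the test of conditions P1 to P3 can be sketched as follows. This is an illustration under assumptions: the function names and the three-player example game are hypothetical, and P3 covers the one-player coalitions only if every single player has value zero.

```python
from itertools import combinations


def coalition_system(n, v):
    """Coalitions with at least two players (ordered by size), matrix A, vector b."""
    players = range(1, n + 1)
    coalitions = [frozenset(c) for size in range(2, n + 1)
                  for c in combinations(players, size)]
    A = [[1 if k in K else 0 for k in players] for K in coalitions]
    b = [v(K) for K in coalitions]
    return coalitions, A, b


def in_core(x, n, v, tol=1e-9):
    """Check conditions P1, P2 and P3 for the payoff vector x."""
    _, A, b = coalition_system(n, v)
    rows = [sum(a * xk for a, xk in zip(row, x)) for row in A]
    p1 = all(r >= bi - tol for r, bi in zip(rows[:-1], b[:-1]))  # proper coalitions
    p2 = abs(rows[-1] - b[-1]) <= tol                            # grand coalition
    p3 = all(xk >= -tol for xk in x)                             # non-negativity
    return p1 and p2 and p3


def v(K):
    """Hypothetical game: every pair earns 60, all three players earn 120."""
    return {2: 60, 3: 120}[len(K)]


print(in_core((40, 40, 40), 3, v))   # equal split
print(in_core((100, 10, 10), 3, v))  # players 2 and 3 would do better alone
```

Because the coalitions are generated in order of increasing size, the last row of the matrix always belongs to the grand coalition, as the ordering step requires.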

If one now replaces P1 by $\sum_{k=1}^{n} a_{ik} x_k + x_{n+1} \ge b_i$ for
$i = 1, \ldots, 2^n - n - 2$ and P3 by $x_k \ge 0$ for $k = 1, \ldots, n+1$, then one obtains
the dual problem:

• Maximize $\sum_{i=1}^{2^n - n - 1} b_i y_i$

• D1: $\sum_{i=1}^{2^n - n - 1} a_{ik} y_i \le 0$ for $k = 1, \ldots, n$

• D2: $\sum y_i$ ...

• D3: $y_i \ge 0$ for $i = 1, \ldots$

References 

1. OBERTHÜR, S.; OTT, H.E. (2000): Das Kyoto Protokoll - International
Climate Policy for the 21st Century (International and European Environmental
Policy Series), Springer Verlag Berlin-Heidelberg.

2. PICKL, S. (1999) Der r-value als Kontrollparameter. Modellierung und Analyse
eines Joint-Implementation Programmes mithilfe der kooperativen dynamischen
Spieltheorie und der diskreten Optimierung, Shaker Verlag Aachen.

3. PICKL, S. (2002) Process Identification and Technical Investments with
TEMPI - Bubbles, Quelros and Environmental Management. Operations Research
Proceedings 2001 (Selected Papers), Springer Verlag Berlin, Heidelberg.





Indirect Expenditure Functions and 
Shephard’s Lemma 



S. Fuchs-Seliger

Universitat Karlsruhe (TH), Kaiserstr. 12, D-76128 Karlsruhe, Deutschland 



Abstract. Based on preferences on the normalized price space, an indirect expen- 
diture function will be defined. We will study the properties of the inverse demand 
function and of the indirect expenditure function following from hypotheses on nor- 
malized prices. It will also be shown that Shephard’s lemma holds without assuming 
transitivity and completeness of the underlying preference relation or differentia- 
bility of the indirect expenditure function. 

Journal of Economic Literature Classification: D11, D69



1 Introduction 

In this article we will investigate consumer behavior when her or his
preferences on the price space $\mathbb{R}^n_{++}$ are known. Every
$p \in \mathbb{R}^n_{++}$ will be interpreted as an income-normalized price
vector. It is appropriate to assume that, if normalized prices increase such
that we have $p > p'$, then the well-being of the individual in price
situation $p$ is worse than in price situation $p'$.

Imposing weak conditions on the preference relation on the price space, an 
indirect expenditure function and an inverse compensated demand function 
will be deduced and their properties will be studied. An inverse compensated 
demand function assigns, to every commodity bundle x, a price vector which 
minimizes the expenditure for x and makes the individual as well-off as in a 
reference price situation $p^0$.

Finally, we will be concerned with Shephard’s Lemma which is an impor- 
tant tool in consumer theory as well as in producer theory. It will be shown 
that Shephard’s lemma holds without imposing hypotheses on the differen- 
tiability of the indirect expenditure function. 

2 Properties of the Price Space and Preliminary 
Results 

In the following we will consider an individual having certain preferences with 
respect to the prices of goods in an economy. If a good becomes cheaper, then 
the individual’s well-being rises. We will base our analysis on the following 
weak hypotheses (for a suitable interpretation assume that the individual has 
one unit of income): 







(P1) $R$ is a relation on the strictly positive $n$-dimensional real space
$\mathbb{R}^n_{++}$. Every $p \in \mathbb{R}^n_{++}$ will be interpreted as
an income-normalized price vector. Therefore, $p = P/M$, where $P$ are the
market prices and $M$ the income of the individual.

(P2) $R$ is reflexive.

(P3) $R$ is lower semicontinuous, i.e. $\{p \mid p^0 R p\}$ is closed in
$\mathbb{R}^n_{++}$ for every $p^0 \in \mathbb{R}^n_{++}$.

(P4) $R$ is decreasing, i.e. for all $p^1, p^2 \in \mathbb{R}^n_{++}$:
$p^1 < p^2 \Rightarrow p^1 P p^2$,¹ where $p^1 P p^2$ means
$p^1 R p^2 \wedge \neg(p^2 R p^1)$.

We can also interpret $p^1 R p^2$ as "$p^1$ is at least as good as $p^2$ in
the eyes of the individual". The above hypotheses are appropriate for a
preference relation on the price space. Apparently, the individual will
prefer lower prices to higher prices when income does not change but is a
fixed amount of money.

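We are going to define a function

$C : \mathbb{R}^n_{++} \times \mathbb{R}^n_{++} \to \mathbb{R}_+$ by
$C(x,p) = \min \{p'x \mid p R p'\}$ for $x \in \mathbb{R}^n_{++}$ and
$p \in \mathbb{R}^n_{++}$, under the supposition that the minimum exists.

$C(\cdot,\cdot)$ will be considered as an indirect expenditure function.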

Preliminarily, one can show 

Lemma 1. Under (P1), (P2) and (P4)

$\tilde{C}(x,p) = \inf \{p'x \mid p R p'\}$

is well-defined for all $x \in \mathbb{R}^n_{++}$, and
$\tilde{C}(x,p) \ge 0$ for all $(x,p) \in \mathbb{R}^{2n}_{++}$.

In order to guarantee the existence of $\min \{p'x \mid p R p'\}$ one can
assume the further condition:

(P5) For every sequence $\langle p^k \rangle$, $p^k \in \mathbb{R}^n_{++}$,
such that $\lim_{k \to \infty} p^k = p^0 \not> 0$, it follows:

For every $p \in \mathbb{R}^n_{++}$ there exists a positive integer $N$ such
that $p^k P p$ for all $k \ge N$.

Note that the indirect utility function $v(p) = \frac{1}{p_1 \cdot p_2}$,
which belongs to the utility function $u(x) = x_1 \cdot x_2$, satisfies (P5).

It should be stressed that the following analysis can also be done for
$\inf \{p'x \mid p R p'\}$ instead of $\min \{p'x \mid p R p'\}$ if we assume
(P1) to (P4) only. Assuming hypothesis (P5) instead of (P4), the preference
field of $R$ may possess "thick" indifference surfaces. This is excluded if
we require $R$ to be decreasing.

Based on the hypotheses (P1), (P2), (P3) and at least one of the hypotheses
(P4) or (P5), we will develop a model of consumer behavior when a preference
relation on income-normalized prices is given.

¹ $p^1 < p^2$ means $p^1_i < p^2_i$ for all $i \le n$.








Lemma 2. Under (P1), (P2), (P3) and (P5) the indirect expenditure function

$C(x,p) = \min_{p' \in \mathbb{R}^n_{++}} \{p'x \mid p R p'\}$ for
$(x,p) \in \mathbb{R}^{2n}_{++}$

is well-defined.

We will define a correspondence $\delta$ by

$\delta(x,p) = \arg\min \{p'x \mid p R p'\}$, $\forall (x,p) \in \mathbb{R}^{2n}_{++}$.

Note that $\delta(x,p)$ can be many-valued.

$\delta$ will be called a "compensated price correspondence" or an "inverse
compensated demand correspondence". By interpretation, $\delta(\cdot,p)$
assigns, to every commodity bundle $x$, all those price situations $p'$ which
minimize the expenditure for commodity bundle $x$ and make the individual
feel no better than he or she felt in price situation $p$. Denote by $I$ the
symmetric part of $R$, i.e. $x I y \Leftrightarrow x R y \wedge y R x$; then
$I$ means indifference.

Lemma 3. If (P1) to (P5) are assumed, and if additionally $R$ is complete²
and upper semicontinuous³, then

$\delta(x,p) = \arg\min_{p' \in \mathbb{R}^n_{++}} \{p'x \mid p' I p\}$.

Proof: By contradiction, suppose $p P p^0$ for some $p^0 \in \delta(x,p)$.
Then, in view of the upper semicontinuity, the completeness and the
decreasingness of $R$, there would exist a $\tilde{p}$ in an
$\varepsilon$-neighborhood of $p^0$ such that $\tilde{p} < p^0$ and
$p P \tilde{p}$. Since then $\tilde{p}x < p^0 x$, we obtain a contradiction to
the definition of $\delta(x,p)$.

The following theorems rely heavily on the concavity of $C(\cdot,p)$,⁴ which
can be easily demonstrated.



Lemma 4. Under the conditions (P1) to (P3) and (P5)

a) $C(\cdot,p)$ is concave on $\mathbb{R}^n_{++}$,

b) $C(\cdot,p)$ is continuous on $\mathbb{R}^n_{++}$.

Remark 1. Concavity of $C(\cdot,p)$ plays an important role for the
differentiability of $C(\cdot,p)$. Since $C(\cdot,p)$ is concave, the function
$-C(\cdot,p)$ is convex. Therefore, we can apply the results on convex
functions. In particular, we obtain that the subdifferential (or more
precisely superdifferential) $\partial C(x,p)$ of $C(\cdot,p)$ is non-empty
for every $x \in \mathbb{R}^n_{++}$ (see Rockafellar [10], Theorem 23.4). We
also have that if $C(\cdot,p)$ has a unique subgradient (supergradient) at
$x$, then $C(\cdot,p)$ is differentiable at $x$, the gradient

$\nabla C(x,p) = \left( \frac{\partial C(x,p)}{\partial x_1}, \dots, \frac{\partial C(x,p)}{\partial x_n} \right)$

is the unique subgradient, and thus
$C(z,p) \le C(x,p) + \nabla C(x,p) \cdot (z - x)$,
$\forall z \in \mathbb{R}^n_{++}$ (see Rockafellar [10], Theorem 25.1, p. 242).

² $R \subseteq X \times X$ is complete if for all $x, y \in X$, $x R y \vee y R x$ holds.

³ $R \subseteq X \times X$ is upper semicontinuous if $\{z \mid z R z^0\}$ is closed in $X$ for all $z^0 \in X$.

⁴ A function $f : \mathbb{R}^n \to \mathbb{R}$ is called concave if for all $x, x' \in \mathbb{R}^n$, $x \ne x'$:
$f(\lambda x + (1-\lambda)x') \ge \lambda f(x) + (1-\lambda) f(x')$, $\forall \lambda \in [0,1]$.



3 On Shephard’s Lemma 

It is well-known that Shephard's lemma is an important tool in both consumer
theory and production theory. In our context Shephard's lemma means that
partial differentiation of the indirect expenditure function $C(x,p^0)$ with
respect to the $i$-th good $x_i$, for every $i \le n$, yields a price vector
$\delta(x,p^0)$ which minimizes the expenditure for $x$ and makes the
individual as well-off as in the price situation $p^0$. We will deduce the
differentiability of $C(\cdot,p)$ directly without using the concept of
subgradients.

Theorem 1. Assume (P1), (P2), (P3) and (P5), and additionally let
$\delta(x,p)$ be single-valued. Then

a) $\delta(\cdot,p^0)$ is continuous on $\mathbb{R}^n_{++}$ for every $p^0 \in \mathbb{R}^n_{++}$,

b) Shephard's Lemma: $\frac{\partial C(x,p)}{\partial x_i} = \delta_i(x,p)$, $\forall x \in \mathbb{R}^n_{++}$.

Proof for a):

Let us consider a price vector $p^0 \in \mathbb{R}^n_{++}$ and a sequence
$\langle x^k \rangle \subset \mathbb{R}^n_{++}$ such that
$\lim_{k \to \infty} x^k = x^0 \in \mathbb{R}^n_{++}$. The definition of
$\delta$ implies $0 \le x^k \delta(x^k,p^0) \le x^k p^0$. Hence,
$\langle \delta(x^k,p^0) \rangle$ is bounded and has a convergent
subsequence. Without loss of generality let this be
$\langle \delta(x^k,p^0) \rangle$ itself. Hence, there exists
$p^* \in \mathbb{R}^n_{+}$ such that
$\lim_{k \to \infty} \delta(x^k,p^0) = p^*$. Since $p^0 R \delta(x^k,p^0)$ for
all $k$, there does not exist a positive integer $N$ such that for all
$k \ge N$, $\delta(x^k,p^0) P p^0$. This together with (P5) implies
$p^* > 0$. Using the lower semicontinuity of $R$, we obtain $p^0 R p^*$.
From the continuity of $C(\cdot,p^0)$ it follows that

$p^* x^0 = \lim_{k \to \infty} \delta(x^k,p^0) \cdot \lim_{k \to \infty} x^k = \lim_{k \to \infty} (x^k \cdot \delta(x^k,p^0)) = \lim_{k \to \infty} C(x^k,p^0) = C(x^0,p^0)$.

Since by assumption $\delta(x,p^0)$ is single-valued, we must have
$p^* = \delta(x^0,p^0)$ and thus
$\delta(x^0,p^0) = \lim_{k \to \infty} \delta(x^k,p^0)$.

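Proof for b):

Let us consider $p^0 \in \mathbb{R}^n_{++}$ and the commodity vectors $x$ and
$x + \Delta x$, and write
$\Delta C(x,p^0) = C(x + \Delta x,p^0) - C(x,p^0)$. Hence,

$\Delta C(x,p^0) = (x + \Delta x) \cdot \delta(x + \Delta x,p^0) - x \cdot \delta(x,p^0)$   (1)

$\qquad = x \cdot \delta(x + \Delta x,p^0) - x \cdot \delta(x,p^0) + \Delta x \cdot \delta(x + \Delta x,p^0)$.

The definition of $\delta$ yields $p^0 R \delta(x + \Delta x,p^0)$ and
$p^0 R \delta(x,p^0)$, and thus we have
$x \cdot \delta(x + \Delta x,p^0) - x \cdot \delta(x,p^0) \ge 0$. In view of
(1) we therefore obtain
$\Delta C(x,p^0) \ge \Delta x \cdot \delta(x + \Delta x,p^0)$. For any
$j \le n$ consider $\Delta x = (0, \dots, 0, \Delta x_j, 0, \dots, 0)$. Then we
obtain $\Delta C(x,p^0) \ge \Delta x_j \cdot \delta_j(x + \Delta x,p^0)$. If
$\Delta x_j > 0$, then we have

$\frac{\Delta C(x,p^0)}{\Delta x_j} \ge \delta_j(x + \Delta x,p^0)$.

If $\Delta x_j < 0$, then we obtain the converse. Since the concavity of
$C(\cdot,p^0)$ implies that $C(\cdot,p^0)$ is right- and left-hand
differentiable (see Fenchel [4], pp. 71, 79) and since $\delta(\cdot,p^0)$ is
continuous on $\mathbb{R}^n_{++}$,

$\lim_{\Delta x_j > 0,\ \Delta x_j \to 0} \frac{\Delta C(x,p^0)}{\Delta x_j} \ge \delta_j(x,p^0) \ge \lim_{\Delta x_j < 0,\ \Delta x_j \to 0} \frac{\Delta C(x,p^0)}{\Delta x_j}$.   (2)

Moreover, in view of the concavity of $C(\cdot,p)$ we have

$\lim_{\Delta x_j < 0,\ \Delta x_j \to 0} \frac{\Delta C(x,p^0)}{\Delta x_j} \ge \lim_{\Delta x_j > 0,\ \Delta x_j \to 0} \frac{\Delta C(x,p^0)}{\Delta x_j}$.

From this together with (2) the assertion follows. □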
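As a numerical illustration of Theorem 1 b), consider the two-good example with indirect utility $v(p) = 1/(p_1 p_2)$. The constraint $p R p'$ then reads $p'_1 p'_2 \ge p_1 p_2$, and minimizing $p'x$ under it gives the closed form used below. The sketch compares central finite differences of $C(\cdot,p)$ with $\delta(x,p)$; the function names, the closed form and the test point are assumptions made for this illustration and do not come from the paper.

```python
import math

def delta(x, p):
    """Expenditure-minimizing prices for v(p) = 1/(p1*p2) (closed form, assumed)."""
    c = p[0] * p[1]                      # level of the constraint p1' * p2' >= c
    return (math.sqrt(c * x[1] / x[0]), math.sqrt(c * x[0] / x[1]))

def C(x, p):
    """Indirect expenditure function C(x, p) = delta(x, p) . x."""
    d = delta(x, p)
    return d[0] * x[0] + d[1] * x[1]

def grad_C(x, p, h=1e-6):
    """Central finite-difference gradient of C(., p) at x."""
    g = []
    for j in range(2):
        up = list(x)
        down = list(x)
        up[j] += h
        down[j] -= h
        g.append((C(up, p) - C(down, p)) / (2 * h))
    return tuple(g)

x, p = (2.0, 3.0), (0.5, 0.4)
print("delta(x, p):", delta(x, p))
print("grad C(x, p):", grad_C(x, p))
```

Up to discretization error the two printed vectors agree, which is what Shephard's lemma asserts for this example.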

Recalling the remark of the second section we immediately obtain that
$C(\cdot,p)$ is differentiable at $x$ for all $x \in \mathbb{R}^n_{++}$ and

$C(z,p) \le C(x,p) + (\delta_1(x,p), \dots, \delta_n(x,p)) (z - x)$, $\forall z \in \mathbb{R}^n_{++}$.

Without requiring single-valuedness of $\delta(x,p)$ one can also show that
the conditions (P1), (P2), (P3) and (P5) imply that $\delta(\cdot,p^0)$ is a
compact-valued and upper hemicontinuous correspondence.

In order to deduce Shephard's Lemma we assumed that $\delta(\cdot,p^0)$ is
single-valued. We will now establish the single-valuedness of
$\delta(\cdot,p^0)$ requiring, additionally, that $R$ is complete,
transitive, continuous and strictly concave. By the latter we mean the
following:

$R$ is strictly concave on $\mathbb{R}^n_{++}$ iff for all $p^1, p^2 \in \mathbb{R}^n_{++}$:

$p^2 R p^1 \wedge p^1 \ne p^2 \Rightarrow p^2 P (\lambda p^1 + (1-\lambda)p^2)$, $\forall \lambda \in\, ]0,1[$.

Theorem 2. Assume (P1) to (P5) and additionally let $R$ be complete,
transitive, continuous and strictly concave.

Then $\delta(x,p^0)$ is single-valued for every $(x,p^0) \in \mathbb{R}^{2n}_{++}$.

Proof:

Consider $p^0 \in \mathbb{R}^n_{++}$. Then under the above conditions
$C(x,p^0)$ is well-defined for every $x \in \mathbb{R}^n_{++}$. Suppose that
there exist at least two different price vectors
$p^1, p^2 \in \delta(x,p^0)$. Hence $p^0 R p^1$ and $p^0 R p^2$. Since $R$ is
complete, without loss of generality let us suppose that $p^1 R p^2$. Then
strict concavity of $R$ implies $p^1 P (\lambda p^1 + (1-\lambda)p^2)$,
$\forall \lambda \in\, ]0,1[$. In view of transitivity and completeness of
$R$, $p^0 P (\lambda p^1 + (1-\lambda)p^2)$, $\forall \lambda \in\, ]0,1[$.
Consider $\lambda = \frac{1}{2}$. Then
$p^0 P \bar{p}$ with $\bar{p} = \frac{1}{2}p^1 + \frac{1}{2}p^2$. Since $R$
is continuous and decreasing, there exists $\tilde{p} \in \mathbb{R}^n_{++}$
such that $\tilde{p} < \bar{p}$ and $p^0 P \tilde{p}$. From this follows
$\tilde{p}x < \bar{p}x$. This together with $p^0 P \tilde{p}$ contradicts the
definition of $C(x,p^0)$. □



Remark 2. Summarizing the results of this section, it was shown that under
the hypotheses of Theorem 2 there exists an inverse compensated demand
function $\delta(\cdot,p^0)$, continuous with respect to $x$, and the partial
derivatives $\frac{\partial C(x,p^0)}{\partial x_i}$ of the indirect
expenditure function $C(x,p^0)$ with respect to $x_i$ are equal to the
components of the inverse compensated demand function. Therefore,
$\left( \frac{\partial C(x,p^0)}{\partial x_1}, \dots, \frac{\partial C(x,p^0)}{\partial x_n} \right)$
is a price system which makes the individual as well-off as in the price
situation $p^0$ and minimizes the expenditure for commodity bundle $x$.



References 

1. Blackorby, Ch.; Primont, D. and Russell, R.R.: Duality, Separability, and
Functional Structure: Theory and Economic Applications. New York:
North-Holland, 1978.

2. Diewert, W.E.: Duality Approaches to Microeconomic Theory, in: K.J. Arrow
and M.D. Intriligator (eds.), "Handbook of Mathematical Economics",
Amsterdam: North-Holland, 1982.

3. Färe, R.; Primont, D.: Multi-Output Production and Duality: Theory and
Applications. Boston: Kluwer Academic Publishers, 1995.

4. Fenchel, W.: Convex Cones, Sets and Functions. Mimeographed lecture notes,
Princeton University, 1951.

5. Fuchs-Seliger, S.: Compensated and Direct Demand without Transitive and
Complete Preferences. Annals of OR 23 (1990), pp. 199-310.

6. Fuchs-Seliger, S.: A further Remark on Shephard's Lemma. Economics Letters
56 (1997), pp. 359-365.

7. Fuchs-Seliger, S.: A Note on Duality in Consumer Theory. Economic Theory
13 (1999), pp. 239-246.

8. Hanoch, G.: Symmetric Duality and Polar Production Functions, in: M. Fuss
and D. McFadden (eds.), "Production Economics: A Dual Approach to Theory
and Applications", vol. 1, Amsterdam: North-Holland, pp. 111-131.

9. Martínez-Legaz, J. E., Santos, M. S.: On Expenditure Functions. Journal of
Mathematical Economics 25 (1996), pp. 143-163.

10. Rockafellar, R. T.: Convex Analysis. Princeton, New Jersey: Princeton
University Press, 1970.

11. Shephard, R.W.: Cost and Production Functions. Repr. of the 1. ed.,
Berlin: Springer, 1981.

12. Shephard, R.W.: Indirect Production Functions. Meisenheim am Glan: Anton
Hain Verlag, 1974.





Bayesian Estimation of the Heston Stochastic 
Volatility Model 



Sylvia Frühwirth-Schnatter¹ and Leopold Sögner²

¹ Email: Sylvia.Fruehwirth-Schnatter@wu-wien.ac.at, Vienna University of
Economics and B.A., Augasse 2-6, A-1090 Vienna, Austria
² Email: soegner@ibab.tuwien.ac.at, Vienna University of Technology,
Theresianumgasse 27, A-1040 Vienna, Austria



Abstract. The goal of this article is an exact Bayesian analysis of the Heston 
(1993) stochastic volatility model, where different parameterizations of the latent 
volatility process and the parameters of the volatility process will be used to improve 
convergence and the mixing behavior of the sampler. We apply the sampler to 
simulated data and to DM/US$ exchange rate data.



1 The Heston model 

In this paper we investigate the following model, based on [2]:

$dx^*(t) = (\mu + \beta\sigma(t)^2)\,dt + \sigma(t)\,dW_0(t)$ ,   (1)

where actual volatility is given by a superposition of $k$ independent square
root processes:

$\sigma^2(t) = \sum_{i=1}^{k} \sigma_i^2(t)$ ,
$d\sigma_i^2(t) = \lambda_i(\alpha_i - \sigma_i^2(t))\,dt + \tau_i\sigma_i(t)\,dW_i(t)$ ,   (2)

where $i = 1, \dots, k$; $\lambda_i$ is often interpreted as the speed of mean
reversion, $\alpha_i$ is the mean of the instantaneous volatility process and
$\tau_i$ is the volatility parameter; $W_j(t)$, $j = 0, 1, \dots, k$, are
independent Brownian motions.

For the underlying stochastic volatility model, the parameters $\mu$,
$\beta$, $\alpha$, $\lambda$ and $\tau$, where
$\alpha = (\alpha_1, \dots, \alpha_k)$, etc., are unknown. These parameters,
abbreviated by $\theta$, have to be inferred from the data available. For
financial time series these are (equidistant) observations of asset yields
$(y_n)$, where the step-width will be denoted by $\Delta$ in the further
analysis. The index $n$ is used to abbreviate the points of time
$\dots, (n-1)\Delta, n\Delta, \dots$, i.e. if $N$ yields are observed, then
$Y = (y_n)_{n=1}^{N}$ within the time span $[\Delta, T]$, where
$T = N\Delta$. In this paper we abbreviate increases in Brownian motion by
$\omega_j(s,t) = W_j(t) - W_j(s)$, $j = 0, 1, \dots, k$. If a constant step
width $\Delta = t - s$ is used, then simply
$\omega_{j,n}(\Delta) = W_j(n\Delta) - W_j((n-1)\Delta)$ will be used. Note
that $\omega_{j,n}(\Delta)$ has the same distribution as
$\sqrt{\Delta}\,\omega_{j,n}(1)$.
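To make the model concrete, the following sketch simulates the one-factor case ($k = 1$) of (1) and (2) on a fine Euler grid and aggregates it to $N$ yields with step width $\Delta$. It is only an illustration: the function name `simulate_heston`, the number of sub-steps, the truncation of negative variances at zero and the parameter values in the usage line are assumptions of this sketch, not the authors' simulation code.

```python
import math
import random

def simulate_heston(n_obs, mu, beta, alpha, lam, tau, delta, substeps=20, seed=0):
    """Euler simulation of the one-factor model.

    Returns (yields, integrated_vols): the yields y_n and the increases h_n of
    integrated volatility over each interval of length delta.
    """
    rng = random.Random(seed)
    dt = delta / substeps
    v = alpha                       # start the variance at its mean (assumption)
    yields, integrated_vols = [], []
    for _ in range(n_obs):
        y, h = 0.0, 0.0
        for _ in range(substeps):
            vp = max(v, 0.0)        # truncate so that the square root is defined
            dw0 = rng.gauss(0.0, math.sqrt(dt))
            dw1 = rng.gauss(0.0, math.sqrt(dt))
            y += (mu + beta * vp) * dt + math.sqrt(vp) * dw0
            h += vp * dt
            v += lam * (alpha - vp) * dt + tau * math.sqrt(vp) * dw1
        yields.append(y)
        integrated_vols.append(h)
    return yields, integrated_vols

# Illustrative (hypothetical) parameter values.
ys, hs = simulate_heston(n_obs=400, mu=0.0, beta=0.0, alpha=0.01,
                         lam=50.0, tau=0.5, delta=1.0 / 252)
print(len(ys), min(hs) >= 0.0)
```

Fixing the seed makes the path reproducible, which is convenient when comparing samplers on the same simulated data set.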







The goal of the following paragraphs is the specification of the conditional
distributions, such that an exact Bayesian analysis can be performed. Let us
define integrated volatility:

$h(t) := \int_0^t \sigma^2(u)\,du$ .   (3)

Conditionally on the increases in integrated volatility $h_n$ from
$t = (n-1)\Delta$ to $t = n\Delta$, the asset returns are normally
distributed, i.e.

$y_n \mid h_n \sim \mathcal{N}(\mu\Delta + \beta h_n, h_n)$ .   (4)

The conditional distribution of the increases in integrated volatility
fulfills
$\pi(h_n \mid h_{n-1}, \sigma^2_{n-1}; \theta) = \pi(h_n \mid \sigma^2_{n-1}; \theta)$.
Although the conditional density
$\pi(h_{i,n} \mid \sigma^2_{i,n-1}; \theta_i)$ cannot be derived
analytically, the Fourier transform is available as described in [4]. The
conditional distribution of the instantaneous volatility process
$\pi(\sigma^2_n \mid \sigma^2_{n-1}; \theta)$ is a non-central $\chi^2$
distribution. Last but not least, a stationary law of $(\sigma_i^2(t))$
exists if the parameters satisfy $\lambda_i > 0$ and
$2\lambda_i\alpha_i/\tau_i^2 > 1$. The stationary law is a Gamma
distribution. For $b_i := 2\lambda_i/\tau_i^2$ and $a_i := b_i\alpha_i$, the
marginal densities of the instantaneous volatilities are given by:

$\pi(\sigma_i^2(t)) = \frac{b_i^{a_i}}{\Gamma(a_i)} (\sigma_i^2(t))^{a_i - 1} \exp(-b_i\sigma_i^2(t))$ .   (5)

Remark 1. Despite the fact that only the conditional distributions are
required to apply MCMC, the restriction $2\lambda_i\alpha_i/\tau_i^2 > 1$,
implying stationary instantaneous volatility processes with marginal gamma
law, is plausible from an economic point of view. If
$2\lambda_i\alpha_i/\tau_i^2 > 1$ is met, the asset yields will remain
stochastic since $\sigma_i^2(t) > 0$ with probability one.
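The stationarity restriction and the parameters of the stationary Gamma law are easy to check numerically. The sketch below is an illustration only; the function names and the parameter values are hypothetical and chosen so that the condition holds.

```python
def is_stationary(lam, alpha, tau):
    """Check lambda > 0 and 2 * lambda * alpha / tau**2 > 1."""
    return lam > 0 and 2.0 * lam * alpha / tau**2 > 1.0

def gamma_stationary_params(lam, alpha, tau):
    """Shape a = b * alpha and rate b = 2 * lambda / tau**2 of the stationary law."""
    b = 2.0 * lam / tau**2
    a = b * alpha
    return a, b

# Hypothetical one-factor parameter values.
lam, alpha, tau = 50.0, 0.01, 0.5
a, b = gamma_stationary_params(lam, alpha, tau)
print(is_stationary(lam, alpha, tau), a, b, a / b)
```

The ratio `a / b` is the mean of the Gamma law and coincides with $\alpha$, in line with the interpretation of $\alpha_i$ as the mean of the instantaneous volatility process.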



2 Bayesian estimation of the Heston model 

Parameter estimation for stochastic volatility models is known to be a
difficult problem (see e.g. [8]). The problem stems from the fact that the
conditional distribution $f(y_n \mid h_n, \mu, \beta)$ of the observed
returns $y_n$ depends on the unobservable integrated volatility $h_n$. To
apply the method of maximum likelihood, where

$g(y_1, \dots, y_N \mid \theta) = \prod_{n=1}^{N} g(y_n \mid y_1, \dots, y_{n-1}, \theta)$ ,   (6)

the closed forms of the conditionals $g(y_n \mid \cdot)$ are required. To
derive the conditional density of the integrated volatility we have to
invert the characteristic function. To derive $g(y_n \mid \cdot)$, the
integrated volatilities have to be integrated out in (4). Since this
integration is practically impossible, the application of maximum likelihood
becomes infeasible.








2.1 Estimation of the parameters of the volatility processes 

To sample from the joint posterior distribution

$\pi(X, \theta \mid Y) \propto f(Y \mid X, \theta)\,\pi(X \mid \theta)\,\pi(\theta)$ ,   (7)

the "complete data" likelihood $f(Y \mid X, \theta)$ as well as the "prior
density" $\pi(X \mid \theta)$ of the joint distribution of the latent
variables $X$ under $\theta$ have to be known explicitly. $\pi(\theta)$ is
the prior of $\theta$. The "complete data" likelihood
$f(Y \mid X, \theta)$ is easily obtained from (4) as the product of $N$
densities from a normal distribution,

$f(Y \mid X, \theta) = \prod_{n=1}^{N} f_{\mathcal{N}}(y_n \mid \mu\Delta + \beta h_n, h_n)$ .   (8)

W.l.o.g. we restrict to $k = 1$ for notational simplicity in this subsection.
The natural candidates to describe the latent process are the integrated
volatilities $X_1^C := (h_{1,C}, \dots, h_{n,C}, \dots, h_{N,C})$ and the
instantaneous volatilities
$X_2^C := (\sigma^2_{1,C}, \dots, \sigma^2_{n,C}, \dots, \sigma^2_{N,C})$.
This way of expressing $X_1$ and $X_2$ will be called the centered version,
where $h_n = id(h_{n,C})$, etc. Following [5] and [6], another version,
called the non-centered version $X_1^{NC}, X_2^{NC}$, uses a starting value
$\sigma_0^2$ and increases in Brownian motion $\omega_n(1)$ to calculate
$(h_n)$ and $(\sigma_n^2)$ by means of the model parameters $\theta$. When
simulating $(\sigma_n^2)$ by means of the Euler scheme, the map

$\omega_n(1) \mapsto \sigma^2_{n-1}\exp(-\lambda\Delta) + (1 - \exp(-\lambda\Delta))\alpha + \tau\sigma_{n-1}\Delta^{1/2}\omega_n(1) =: \sigma^2_{n,NC}$ ,

where $\omega_n(1) \sim \mathcal{N}(0,1)$, provides us with an (approximation
of the) instantaneous volatility process. Since this map is one-to-one it is
indeed possible to derive the process $(\omega_n(1))$ from the instantaneous
volatilities $(\sigma_n^2)$. An integration of $\sigma^2$ over the time
interval $\Delta$ provides us with
$h_{n,NC} = \alpha\Delta + (\sigma^2_{n-1} - \alpha)\,\cdots + \tau\sigma_{n-1}(\Delta)^{\cdots}\,\tilde{\omega}_n(1)$.

The reader should note that we use $\omega_n(1)$ and $\tilde{\omega}_n(1)$ in
the above expression. If $\Delta$ approaches 0, then the equality
$\omega_n(1) = \tilde{\omega}_n(1)$ has to hold. However, for observations on
a discrete grid, $h_n$ remains a random variable given the instantaneous
volatilities. To account for this fact we used $\omega_n(1)$ and
$\tilde{\omega}_n(1)$. In the non-centered version the yields and $(h_n)$
depend on the parameters $\theta$. This is not the case when the centered
version is applied. Following [7], there exists the opportunity to construct
a partially non-centered parameterization. This can be obtained by a convex
combination of centered and non-centered terms, i.e.
$X_1^{PC} = (1-\nu)X_1^{C} + \nu X_1^{NC}$ and
$X_2^{PC} = (1-\nu)X_2^{C} + \nu X_2^{NC}$, so that
$h_n = (1-\nu)h_{n,C} + \nu h_{n,NC}$, etc.
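The non-centered map and its inverse, together with the convex combination that defines the partially centered version, can be sketched as follows. The update formula mirrors the Euler-type map described above, under the assumption that the leading term is the previous variance multiplied by $\exp(-\lambda\Delta)$; the function names and numerical values are hypothetical and only meant to show that the map is one-to-one in $\omega_n(1)$.

```python
import math

def nc_step(sigma2_prev, omega, alpha, lam, tau, delta):
    """Map a standard normal innovation omega to the next instantaneous variance."""
    decay = math.exp(-lam * delta)
    return (sigma2_prev * decay + (1.0 - decay) * alpha
            + tau * math.sqrt(sigma2_prev) * math.sqrt(delta) * omega)

def nc_inverse(sigma2_prev, sigma2_next, alpha, lam, tau, delta):
    """Recover the innovation omega from two consecutive variances."""
    decay = math.exp(-lam * delta)
    drift = sigma2_prev * decay + (1.0 - decay) * alpha
    return (sigma2_next - drift) / (tau * math.sqrt(sigma2_prev) * math.sqrt(delta))

def partially_centered(x_c, x_nc, nu):
    """Convex combination (1 - nu) * centered + nu * non-centered, element-wise."""
    return [(1.0 - nu) * c + nu * nc for c, nc in zip(x_c, x_nc)]

# Round trip with hypothetical values: omega -> sigma2 -> omega.
params = dict(alpha=0.01, lam=50.0, tau=0.5, delta=1.0 / 252)
s2 = nc_step(0.012, 0.3, **params)
print(nc_inverse(0.012, s2, **params))
print(partially_centered([1.0, 2.0], [3.0, 4.0], nu=0.5))
```

With `nu = 0` the combination reduces to the centered version and with `nu = 1` to the non-centered one, so a single weight interpolates between the two parameterizations.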

After we have discussed how volatilities can be represented in our analysis,
let us investigate the "prior" $\pi(X \mid \theta)$, where
$X = (X_1^j, X_2^j)$ with $j \in \{C, NC, PC\}$. From the above definitions
$\pi(X \mid \theta)$ is nothing more than the product of $N$ conditional
densities $\pi(h_n, \sigma_n^2 \mid h_{n-1}, \sigma^2_{n-1}; \theta)$, where
$h_n$ and $\sigma_n^2$ are conditionally independent, i.e.

$\pi(h_n, \sigma_n^2 \mid h_{n-1}, \sigma^2_{n-1}; \theta) = \pi(h_n \mid \sigma^2_{n-1}; \theta)\,\pi(\sigma_n^2 \mid \sigma^2_{n-1}; \theta)$ .   (9)

From section 1 we already know that the first density in (9) can be
reconstructed by means of Fourier inversion while the second density is given
by a non-central $\chi^2$ law.



2.2 Priors 

In the ongoing analysis we put the following priors on the model parameters:
$\mu \sim \mathcal{N}(m_0, M_0)$, $\beta \sim \mathcal{N}(b_0, B_0)$ and
$\alpha_i \sim \mathcal{N}(a_0, A_0)$. For $\lambda_i$ and $\tau_i$ we use
gamma priors $\mathcal{G}(l_0, L_0)$ and $\mathcal{G}(d_0, D_0)$.
$\mathcal{N}$ stands for the normal law, $\mathcal{G}$ for the gamma
distribution, and $\mathcal{U}$ for the uniform distribution. The parameters
$\mu$ and $\beta$ will not be estimated and are set to zero, i.e.
$m_0 = b_0 = 0$ and $M_0 = B_0 = 0$. We use $l_{0,i} = 1$,
$L_{0,i} = 1/50$, $d_{0,i} = 0.01$ and $D_{0,i} = 1/5$. For $h_{i,n,C}$,
$h_{i,n,NC}$, $\sigma^2_{i,n,C}$ and $\sigma^2_{i,n,NC}$ we construct the
priors by geometric averaging, i.e.

$\pi(h_n) = \pi_U(h_{i,n,C})^{1-\nu_i}\,\pi_U(h_{i,n,NC})^{\nu_i}$ ,
$\pi(\sigma_n^2) = \pi_U(\sigma^2_{i,n,C})^{1-\nu_i}\,\pi_U(\sigma^2_{i,n,NC})^{\nu_i}$ .

For $\pi_U(\cdot)$ we use a uniform distribution on $[0, S]$, where $S$ is
sufficiently large.

For $\sum_{i=1}^{k} \alpha_i$ we use an informative prior. The importance of
using an informative prior with the mean volatility parameters can be
motivated by the fact that (i) $\sum_{i=1}^{k} \alpha_i$ equals the expected
sum of instantaneous volatilities and (ii) the properties of a maximum
likelihood estimation (with a fixed process $(\sigma_n^2)$ and $k = 1$) also
improve if $\alpha$ is a priori fixed at the sample mean. To incorporate this
"prior" knowledge into a Bayesian model, we use strong priors on
$\sum_{i=1}^{k} \alpha_i$; more precisely we set $a_0 = (y'y/T)/\Delta$ and
$A_0 = 0.1\,a_0$.



2.3 MCMC estimation 

Subsections 2.1 and 2.2 have already described the conditional distributions
and the priors of the underlying stochastic volatility model. In this paper
we restrict ourselves to the estimation of $(h_{i,n})$, $(\sigma^2_{i,n})$
and the parameters $\alpha$, $\lambda$ and $\tau$. $\beta$ and $\mu$ are
fixed at 0. We derive samples of the parameters $X, \theta$ as follows:

Step 1: $h_n$ from $\pi(h_n \mid Y, \dots)$

Step 2: $\sigma_n^2$ from $\pi(\sigma_n^2 \mid Y, X_1, \sigma^2_{-n}, \theta)$

Step 3: $\alpha$ from $\pi(\alpha \mid Y, X_1, X_2, \lambda, \tau, \nu)$

Step 4: $\lambda$ from $\pi(\lambda \mid Y, X_1, X_2, \alpha, \tau, \nu)$

Step 5: $\tau$ from $\pi(\tau \mid Y, X_1, X_2, \alpha, \lambda, \nu)$



3 Results 

First we tested our sampler on simulated data. For the one factor setting
($k = 1$), we generated paths with $N = 400$, $\alpha = 0.0012$,
$\lambda = 50$, $\tau = 0.5$, $\beta = 0$ and $\mu = 0$. Throughout this
paper, we work with a step-width $\Delta = 1/252$. For the two factor setting
($k = 2$), we set $\alpha = (0.006\ 0.007)'$, $\lambda = (50\ 150)'$,
$\tau = (0.5\ 1)'$, $\beta = 0$ and $\mu = 0$. The parameters are updated by
means of the Metropolis-Hastings (MH) algorithm. In the MH algorithm we use
smoother densities described in [3] to update the parameters of the latent
process $X = (X_1^j, X_2^j)$ with $j \in \{C, NC, PC\}$.

In all update steps random-walk proposals are used.
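A generic random-walk Metropolis-Hastings update, of the kind referred to above, can be sketched as follows. This is a textbook version for a scalar parameter and not the authors' sampler: the function name `rw_metropolis`, the Gaussian proposal, the step size and the standard normal toy target are assumptions made for illustration.

```python
import math
import random

def rw_metropolis(log_target, x0, step, n_iter, seed=0):
    """Random-walk Metropolis-Hastings for a scalar parameter.

    log_target returns the log of the (unnormalized) target density and may
    return -inf outside the support. Returns (draws, acceptance_rate).
    """
    rng = random.Random(seed)
    x, lp = x0, log_target(x0)
    draws, accepted = [], 0
    for _ in range(n_iter):
        proposal = x + rng.gauss(0.0, step)          # symmetric proposal
        lp_prop = log_target(proposal)
        accept_prob = math.exp(min(0.0, lp_prop - lp))
        if rng.random() < accept_prob:
            x, lp = proposal, lp_prop
            accepted += 1
        draws.append(x)
    return draws, accepted / n_iter

# Toy target: standard normal density, known only up to a constant.
draws, rate = rw_metropolis(lambda z: -0.5 * z * z, x0=0.0, step=1.0, n_iter=20000)
print(sum(draws) / len(draws), rate)
```

The acceptance rate returned by the function is the quantity that is tuned through the proposal step size and, in the paper's setting, through the choice of parameterization.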

The most important point of this analysis is the parameterization of the
latent process and blocking. For a centered parameterization we observe that
$\lambda$ and $\tau$ are positively correlated and that their posterior
modes are much too large. This can be improved neither by blocking steps 3
and 4 nor by blocking steps 3-5. For the non-centered parameterizations the
acceptance rates in the updates of $\theta$ are very small and the posterior
distributions of $\lambda$, $\tau$ have modes much smaller than the true
parameter values. This motivated us to construct the partially centered
parameterization. With a mean of the hyper-parameter $\nu_i = 0.05$ the
partially centered version results in good acceptance rates, and the modes
or means of the estimated posterior of the parameters are close to the true
parameter values. This can be observed for both the one and the two factor
setting.

After we had tuned our sampler with simulated data, we applied this
algorithm to daily DM/US$ exchange rate data from January 1998 to December
1999, where $N = 504$. Note that the squared variation $y'y/T$ was equal to
3.4224e-5, such that a posterior mean for $\sum_i \alpha_i$ of
$(y'y/T)/\Delta = 0.0086$ can be expected. For the empirical data a small
$\nu_i$, $\nu_i = 0.025$, was necessary to get considerable acceptance
rates. We observe good mixing for the one factor model, while the mixing for
the first factor ($\lambda_1$ small) in the two factor setup has to be
improved by further tuning. We claim that this can be done by tuning
$\nu_i$. Table 1 presents the mean and the standard deviation of the
posterior for the corresponding model parameters for the one factor and the
two factor setting. Comparing our results to the estimates of [1], we observe
that our one factor estimates are very close to the estimates presented in
[1], while for the two factor model both persistence parameters $\lambda_i$
are larger than in [1].
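The expected posterior mean quoted above follows from simple arithmetic with the two figures given in the text, as the short check below shows. The variable names are hypothetical; only the numbers 3.4224e-5 and 1/252 come from the paper.

```python
squared_variation = 3.4224e-5   # y'y/T as reported for the DM/US$ data
delta = 1.0 / 252               # step width used throughout the paper

a0 = squared_variation / delta  # prior mean for the sum of the alpha_i
A0 = 0.1 * a0                   # second parameter of the normal prior (Section 2.2)

print(round(a0, 4), A0)
```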



Table 1. DM/US$ exchange rates from January 1998 to December 1999, N = 504.
Posterior means, standard deviations in parentheses. 10000 simulation steps,
1000 burn-in steps.

Setting   Parameter     Posterior mean   (Std. dev.)
k = 1     $\alpha$      0.0089           (5e-4)
k = 1     $\lambda$     61.2167          (11.511)
k = 1     $\tau$        11.5113          (0.413)
k = 2     $\alpha_1$    0.0044           (3e-4)
k = 2     $\lambda_1$   43.7663          (4.475)
k = 2     $\tau_1$      0.3094           (0.011)
k = 2     $\alpha_2$    0.0040           (3e-4)
k = 2     $\lambda_2$   203.1584         (23.928)
k = 2     $\tau_2$      1.0441           (0.052)







4 Conclusions 

This article has implemented an exact Bayesian analysis of the Heston (1993) 
stochastic volatility model. We observe that different parameterizations of the 
latent volatility process and the parameters of the volatility process result in 
very different convergence behavior of the MCMC sampler; the best per- 
formance has been observed with a partially centered version of the latent 
process. We apply the sampler to simulated data and to DM/US$ exchange 
rate data. 

References 

1. Bollerslev, T., H. Zhou (2002) Estimating stochastic volatility diffusion
using conditional moments of integrated volatility. Journal of Econometrics,
109, 33-65.

2. Heston, S.L. (1993) A closed-form solution for options with stochastic
volatility with applications to bond and currency options. Review of
Financial Studies, 6, 327-343.

3. de Jong, P., N. Shephard (1995) The simulation smoother for time series
models. Biometrika, 82, 339-350.

4. Lamberton, D., B. Lapeyre (1996) Introduction to Stochastic Calculus -
Applied to Finance. Chapman & Hall, London, 1st edition.

5. Papaspiliopoulos, O., P. Dellaportas, G. Roberts (2001) Bayesian inference
for non-Gaussian Ornstein-Uhlenbeck stochastic volatility processes. Mimeo,
Lancaster.

6. Roberts, G., O. Stramer (2001) On inference for partially observed
nonlinear diffusion models using the Metropolis-Hastings algorithm.
Biometrika, 88, 603-621.

7. Papaspiliopoulos, O., G. Roberts, M. Skold (2002) Non-centered
parameterisations for hierarchical models and data augmentation. Mimeo,
Lancaster.

8. Shephard, N. (1996) Statistical aspects of ARCH and stochastic volatility.
In Cox, D., D. Hinkley and O. Barndorff-Nielsen (Eds.): Time Series Models in
econometrics, finance and other fields. Chapman & Hall, London.





Forecasting with Leading Economic Indicators - 
A Neural Network Approach 



Timotej Jagric 

Department for quantitative economic analysis, Faculty of Economics and Busi- 
ness, University of Maribor, Razlagova 14, 2000 Maribor, Slovenia, Tel: +386 2 
22 90 343 Fax.: +386 2 25 16 141, E-mail: timotej.jagric@uni-mb.si 



1 Introduction 

There is a variety of important issues associated with the problem of
business cycle forecasting, especially regarding forecast methodology and
forecast evaluation. Overall, we can say that macroeconomic forecasting has a
fairly poor reputation (Granger 1996). Still, even with the recognition that
forecasting business cycles is a very difficult task, we find some hopeful
signs for future progress. Our research on forecasting has focused on the
development of a new approach to forecasting with classical NBER leading
indicators by applying neural networks.

The decision to focus on neural networks arises directly from the features
of these models, as described by Bishop (1995). First, neural networks are
data-driven and can "learn" from, and adapt to, underlying relationships.
This is useful in contexts where one does not have any a priori beliefs about
functional forms. Second, when properly specified they are universal function
approximators. Finally, neural networks are non-linear, which seems to be the
case for many macroeconomic time series.

The paper is organized as follows. In section two the structure of the database 
and the selection of reference series are explained. The selected input variables of 
the model are presented in section three. The suggested model is explained in sec- 
tion four. In section five of the paper we present the results. 



2 Data 

An important step in constructing the model is the development of a broad
database, which should cover all crucial fields of economic activity. The
database used in the model includes 365 time series. To ensure sufficient
transparency, the time series are classified into categories. The database
can be divided into two major groups of time series: time series
representing Slovene economic activity and time series representing foreign
economic activity. Since Slovenia gained independence in October 1991, the
time series start in January 1992. The database covers the period
1992:01 - 2001:08. In the final version of the database, all time series
were transformed into growth rates.

The aim of the model is to forecast a selected reference variable, which is
the benchmark that indicates fluctuations in economic activity. We selected
the monthly index of total industrial production. Extensive analysis (Jagric
2002) of this reference variable also supported our decision, since it showed
that industrial production has the same cyclical characteristics as GDP in
Slovenia.



3 Scoring System for Business Cycle Indicators 

To construct a forecasting model, a selection of input variables is needed. In our 
study we extended the use of criteria employed by NBER. The scoring of each se- 
ries reflects our desire not only to make as explicit as possible the criteria for se- 
lecting indicators but also to increase the amount of information available to the 
user in order to aid in evaluating their current behavior. The scoring plan includes 
five major elements: economic significance, statistical adequacy, promptness of 
publication, smoothness, conformity and timing. When the subheads under these 
elements are counted, eight different properties of series are rated in all. 

We decided to score all time series. The total score of each time series, the
theoretical lead-time and the results of graphical analysis were then used
to form the group of leading indicators, which includes 58 time series from
the database. The average lead-time is determined by cross-spectral analysis
(phase and coherency) and the Granger test of causality, where two criteria
were used: the value of the adjusted determination coefficient and the
Akaike information criterion. The average lead-time is only an estimate of
the actual lead-time for the selected time series. The scoring system we
used ensured that the selected indicators possess the best characteristics
among all time series in the database.



4 Neural Network Model 

The scoring system, which was used, determined the selection of input variables in 
our model. The target variable is the monthly index of industrial production. The 
input and target variables are not seasonally adjusted. As Stock and Watson 
(1998) noted, seasonal adjustment procedure applied to the data may be cleansing 
them of any underlying non-linearities. In our case many selected variables show 
an underlying trend. As there is no universal trend function, which could be ap- 
plied to all variables, we decided to use one-year and monthly growth rates (in 
decimal). These transformations also ensured that the values of variables do not 
differ significantly from each other and are between -1 and 1. 
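The growth-rate transformation described above can be sketched in a few lines. The helper name `growth_rates` and the sample index values are hypothetical; a lag of 1 gives monthly growth rates and a lag of 12 gives one-year growth rates for monthly data, both expressed in decimals.

```python
def growth_rates(series, lag):
    """Growth rate (in decimals) of each observation relative to the value lag periods earlier."""
    return [(series[t] - series[t - lag]) / series[t - lag]
            for t in range(lag, len(series))]

# Hypothetical monthly index values.
index = [100.0, 110.0, 121.0, 115.0]
print(growth_rates(index, lag=1))
```

Since the first `lag` observations have no predecessor, the transformed series is shorter than the original by `lag` values, which matters when aligning inputs with the target variable.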

Another important reason for pre-processing is the phenomenon known as the
'curse of dimensionality'. If we are forced to work with a limited quantity of
data, as we are in practice, then increasing the dimensionality of the input space can rapidly
lead to the point where the data is very sparse, in which case it provides a very
poor representation of the mapping. Therefore our goal is to map vectors $x^n$ in a
d-dimensional space $(x_1, \ldots, x_d)$ onto vectors $z^n$ in an M-dimensional space
$(z_1, \ldots, z_M)$, where M < d. To achieve this goal, we use an unsupervised linear
transformation technique known as principal component analysis (PCA), in which a
set of data is summarized as a linear combination of an orthonormal set of vectors.
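A minimal sketch of the projection idea follows, under simplifying assumptions: it extracts only the leading principal component (M = 1) by power iteration on the sample covariance matrix, and the data values are made up. The paper itself used Matlab and retained several components, so this is not the authors' procedure.

```python
import math


def center(rows):
    """Subtract the column means from every observation."""
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of already centered observations."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_component(cov, iterations=200):
    """Power iteration: unit vector along the direction of largest variance."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical observations in a d = 3 dimensional input space.
data = [[1.0, 2.1, 0.5], [2.0, 3.9, 0.4], [3.0, 6.2, 0.6],
        [4.0, 7.8, 0.5], [5.0, 10.1, 0.5]]
centered = center(data)
u1 = leading_component(covariance(centered))
# One score per observation: the d-dimensional vector mapped to M = 1 dimension.
scores = [sum(x * u for x, u in zip(row, u1)) for row in centered]
```

Further components would be obtained from the remaining eigenvectors of the covariance matrix, which are orthogonal to `u1`.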

More challenging was to determine the design of the neural network. Changes to
the architecture can fundamentally alter the forecasts produced by the network,
even when no changes are made to the inputs, outputs or sample size. It is important
to distinguish between two distinct aspects of the architecture selection problem.
First, we need a systematic procedure for exploring some space of possible
architectures, and this forms the subject of this section. Second, we need some
way of deciding which of the architectures considered should be selected. This is
determined by the requirement of achieving the best possible generalization.

In our case we first formed some basic requirements for the network. First, we
use a feed-forward back-propagation network, since this type of network is
the one most commonly used in forecasting applications. Due to the type of input and output data,
we use only two types of transfer function: the pure linear and the tan-sigmoid
activation function. Next, we decided that only the output layer will have a bias.
The output layer consists of one neuron, since we try to predict the future value
of one reference series. Last, the network will not have more than three layers of
neurons, since such a network is a universal approximator.

The above requirements greatly reduced the space of possible architectures.
Therefore we could apply a simplified tiling algorithm (Mezard and Nadal
1989). We build the network in successive layers, with each layer having fewer
units than the previous layer. When a new layer is constructed, a single unit,
called the master unit, is added. Then additional units are added step by step.
At every step the network is trained and the forecasting performance is estimated.
The whole process is repeated until a larger network no longer contributes
sufficiently to the forecasting performance.

To train our network, we use a quasi-Newton approach, since it has a significant
advantage over the conjugate gradient method, which is normally used for training.
One of the problems that occur during neural network training is called
overfitting - the network has memorized the training examples, but it has not learned to
generalize to new situations. One method for improving network generalization is
to use a network that is just large enough to provide an adequate fit (Hagan et al.
1996). Since we do not know how large a network should be in our application,
we selected regularization. It has been found empirically that by using a
regularizer one can significantly improve the network generalization (Hinton 1987).








5 Results 

In the process of network architecture selection, we tested every form for 6-, 9-, 12-
and 15-month forecasts (all calculations were performed with Matlab v.6.0 R12).
We developed a special program, which automatically supervised the testing and
recorded the performance. The performance was measured with different criteria:
total mean square error for in- and out-of-sample data, the determination coefficient
for the regression between the target variable and the estimated variable, and a comparison of the
spectra of the target and estimated variables. Testing was performed on two
computers. Major testing was performed on an IBM Server X220 with a Pentium X-III 866
MHz processor. The stability of results was tested on the second computer, with a Pentium III
500 MHz processor. Testing required about 600 computing hours on the server, since every
estimation of a possible architecture was repeated up to 5000 times. To avoid idle
processing time, the developed program controlled this process. After each cycle
the program reported final results, which were then tested manually on the second
computer.

It has frequently been argued that statistical error measures do not measure the 
right thing. Therefore many authors developed additional evaluation methods 
(Stekler 1991, Leitch and Tanner 1991). In the experimental phase of testing, we 
discovered that even if a network scores high, the results may not be stable or are 
not well suited for forecasting. Therefore we developed a new testing routine for 
networks with two or more layers. 

After extensive testing of possible forms of network architectures, we selected
a neural network that can be represented by the following equations:

$$y = f^3\left(W^3 f^2\left(W^2 f^1\left(W^1 x\right)\right) + b^3\right) \qquad (1)$$

$$f^i = \mathrm{purelin}(n) = n, \quad i = 3, 1 \qquad (2)$$

$$f^j = \mathrm{tansig}(n) = \frac{2}{1 + e^{-2n}} - 1, \quad j = 2 \qquad (3)$$

where y is the output vector, $f^i$ is the matrix of activation functions of the neurons in
layer i, and $W^i$ is the matrix of weights in layer i. Only the output layer of neurons has
a bias vector, which is represented by $b^3$. As can be seen, we selected a
three-layer network. The input layer has three neurons (each with a pure linear
activation function), the hidden layer has two neurons (each with a non-linear tansig
activation function), and the output layer has one neuron (with a pure linear activation
function). The selected model performed best for 9-, 12- and 15-month forecasts.
Forecasts for longer periods did not produce good results - as we expected, since
we did not have long time series.
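The forward pass through such a three-layer network can be sketched as follows. The weight values are hypothetical placeholders, not the estimated parameters of the paper, and the input dimension of seven assumes that the seven retained principal components are the inputs.

```python
import math


def purelin(n):
    """Pure linear transfer function."""
    return n


def tansig(n):
    """Tan-sigmoid transfer function, 2 / (1 + exp(-2n)) - 1."""
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0


def layer(weights, inputs, activation, bias=None):
    """Apply one layer: activation(W * inputs + bias), neuron by neuron."""
    outputs = []
    for i, row in enumerate(weights):
        net = sum(w * x for w, x in zip(row, inputs))
        if bias is not None:
            net += bias[i]
        outputs.append(activation(net))
    return outputs


def forecast(x, W1, W2, W3, b3):
    """Three linear neurons, two tansig neurons, one biased linear output."""
    a1 = layer(W1, x, purelin)
    a2 = layer(W2, a1, tansig)
    return layer(W3, a2, purelin, bias=b3)[0]


# Hypothetical weights: W1 is 3x7, W2 is 2x3, W3 is 1x2, b3 has one element.
W1 = [[0.2, -0.1, 0.0, 0.3, 0.1, -0.2, 0.05],
      [-0.3, 0.4, 0.1, 0.0, -0.1, 0.2, 0.1],
      [0.1, 0.1, -0.2, 0.2, 0.0, 0.0, -0.1]]
W2 = [[0.5, -0.4, 0.3], [-0.2, 0.6, 0.1]]
W3 = [[0.7, -0.5]]
b3 = [0.02]

x = [0.1, -0.05, 0.02, 0.0, 0.03, -0.01, 0.04]   # made-up input vector
y = forecast(x, W1, W2, W3, b3)
```

Because only the hidden layer is non-linear and bounded, the output is confined to the interval spanned by the output weights and the bias.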





Fig. 1. Performance of neural network model. Panel labels: forecast and reference
series; forecast and reference series (trend-cycle). Axis labels of the lower panels:
reference series; reference series (trend-cycle).



By using PCA we were able to reduce the input space significantly. Instead of 
58 input variables, we used only seven principal components, since we eliminated 
those principal components that contribute less than 10% to the total variation in 
the data set of input variables. 



The forecast performance of the new model of leading indicators for Slovenia
is presented in Figure 1. The parameters of the model were estimated on the data
from 1993:01 to 1997:08. As the upper two graphs in Figure 1 suggest, the model
captures the dynamics of the reference series well. All turning points were detected
and the forecast value follows the dynamics of the reference series.



The statistical properties of the estimated model are presented in the lower two
graphs. We performed a post-regression analysis, where we compared the original
(target) data with the forecast (estimated) data for in- and out-of-sample data. As can
be seen, the model best captures the business cycle frequencies of the reference
series. The high frequencies are not detected. This was expected, since we use
principal components - non-measurable indicators, which do not contain all the
information of the original leading indicators. In addition, we only searched for a linkage
between the reference series and the leading indicators at business cycle frequencies.








In future work, we hope to further refine the best neural network models in this
research (by considering additional types of networks, different training methods,
etc.) for use as forecasting tools that exploit readily available data in order to gauge
future economic activity. Forecast comparisons with other models, such as
vector error-correction models, may also be undertaken. In addition, future projects may
involve the construction of neural network models to forecast other important
macroeconomic variables.



References 

Bishop CM (1995) Neural Networks for Pattern Recognition. Oxford University Press,
New York

Charemza W, Deadman DF (1992) New directions in econometric practice: General to
specific modelling, cointegration and vector autoregression. Edward Elgar, Aldershot

Granger CWJ (1996) Can We Improve the Predictive Quality of Economic Forecasts?
Journal of Applied Econometrics 11: 455-473

Hagan MT, Demuth HB, Beale MH (1996) Neural Network Design. PWS Publishing,
Boston

Hinton GE (1987) Learning translation invariant recognition in massively parallel
networks. In: Bakker JW, Nijman AJ, Treleaven PC (eds) Proceedings PARLE
Conference on Parallel Architectures and Languages Europe. Springer Verlag, Berlin

Jagric T (2002) Measuring Business Cycles. Eastern European Economics 40: 63-87

Leitch G, Tanner JE (1991) Economic Forecast Evaluation: Profits Versus The
Conventional Error Measures. American Economic Review 81: 581-590

Mezard M, Nadal JP (1989) Learning in feedforward layered networks: The tiling
algorithm. Journal of Physics A 22: 2191-2203

Stekler HO (1991) Macroeconomic Forecast evaluation techniques. International Journal of
Forecasting 7: 375-384

Stock JH, Watson MW (1989) New Indexes of Coincident and Leading Economic
Indicators. In: Blanchard O, Fischer S (eds) NBER Macroeconomics Annual. MIT Press,
Cambridge





Estimating Multivariate Conditional 
Distributions - An Application to the Truck 
Sales Forecast 



Eric A. Stützle and Tomas Hrycej
Information Mining

DaimlerChrysler AG, Research & Technology Ulm,
P.O. Box 2360, 89081 Ulm, Germany
E-mail: Eric.Stuetzle@DaimlerChrysler.com



Abstract. A concept for forecasting the conditional multivariate distribution has 
been developed. It allows the forecast of the joint distribution of target variables in 
dependence on explaining variables. The concept can be applied to general distribu- 
tion families such as stable or hyperbolic distributions. The conditional distribution 
parameters are estimated by a global optimization method, using neural networks
for functional approximation. The information about a complete distribution of 
forecasts can be used to quantify the reliability of the forecast. A comparison with 
conventional forecasting concepts is done and the additional benefit of forecasting 
conditional distribution in general, and of hyperbolic distribution in particular is 
shown. The concept is illustrated on a case study concerning the future truck de- 
mand. In this application, the distribution parameters are conditional on properties 
of the product and information about existing orders. 

1 Introduction 

It is a widespread practice to understand a forecast for a variable of interest 
as a single number. For example, an inflation rate forecast may amount to 
“three per cent”. The most common meaning of this single number is the 
estimated mean value. 

In contrast to this simplification, the statistical view of forecast implies 
that the forecast variable (or, generally, forecast vector) y is random with a 
certain distribution conditional on explaining variables x [6]: 

$$f(y \mid x) \qquad (1)$$

For decision making about future actions, every action can be assigned 
a value resulting from this action assuming a certain state of the world (de- 
scribed by y). With a forecast distribution of the state of the world, a decision 
optimal in some sense can be taken. For example, an optimality criterion may 
be the minimum expected loss. Let us define the loss for action a and state 
of the world y, conditioned on explaining variables x, as $L(a, y, x)$. Then, the
optimum decision is

$$\arg\min_a \int_Y L(a, y, x)\, f(y \mid x)\, dy,$$

which is obviously not calculable without knowing the conditional probability
density function $f(y \mid x)$.
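A small numerical sketch may make the role of the full distribution concrete. It assumes a discretized forecast distribution and a hypothetical asymmetric loss (under-supply three times as costly as over-supply); neither is taken from the paper.

```python
def expected_loss(action, outcomes, probabilities, loss):
    """Discrete approximation of the integral of L(a, y) f(y | x) over y."""
    return sum(p * loss(action, y) for y, p in zip(outcomes, probabilities))


def best_action(actions, outcomes, probabilities, loss):
    """Action with the minimum expected loss."""
    return min(actions,
               key=lambda a: expected_loss(a, outcomes, probabilities, loss))


def asymmetric_loss(a, y):
    """Hypothetical loss: 3 per missing unit, 1 per surplus unit."""
    return 3.0 * (y - a) if y > a else 1.0 * (a - y)


# Hypothetical forecast distribution of demand y for a given x.
outcomes = [80, 90, 100, 110, 120]
probabilities = [0.1, 0.2, 0.4, 0.2, 0.1]

decision = best_action(outcomes, outcomes, probabilities, asymmetric_loss)
mean_forecast = sum(y * p for y, p in zip(outcomes, probabilities))
```

With these numbers the mean forecast is 100, but the loss-minimizing decision is 110, which a single-number forecast could not reveal.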

Unfortunately, it is not easy to figure out a general law expressing the 
dependence of the distribution on the given attribute vector x. If we had a 
set of cases for each attribute vector with different outcomes, we would be 
able to compute the conditional distribution over this set in a straightforward 
way. Unfortunately, this comfortable setting rarely occurs in practice.
This is why we have to find out the distribution in a less direct way: as a 
function of the attribute vector x. 

The need for identification of a complete and general conditional distribu- 
tion (1) motivated the development of the forecasting model presented here. 
We are roughly following the concepts developed by [8]. Nonlinear dependen- 
cies are represented by a multi-layer perceptron [4], and estimator parameters 
are identified numerically with help of the maximum likelihood principle and 
a global optimization method [2]. The developed general concept can be used 
under many distribution assumptions, such as the multivariate Gaussian, 
Student, stable [3], hyperbolic distributions [1], or to nonparametric mixture 
distribution approximators ([7]). 

In Section 2 we develop a concept of the underlying general probabilistic 
model. The method to estimate the distribution parameters with the help of 
a global optimization method, neural networks for functional approximation 
and a developed transformation is specified in Section 3 and 3.2. A short 
overview of applicable probability distributions is given in Section 5. The 
application of the truck demand forecast is presented in Section 6. 

2 Probabilistic Model 

2.1 A general model of probability distributions 

To generate distributions from a certain class, it is frequently possible to 
find a "canonical form" as a representative of this class, and to derive all
distributions from this class by a linear transform of the random vector. It 
is important that this canonical density has only few parameters, or even 
is completely parameter-free. This is the way we are going to follow in this 
subsection. 

Suppose there is an input vector $x \in \mathbb{R}^m$ and an n-dimensional random
vector Z with some simple conditional probability density function (PDF)

$$g(Z = z). \qquad (2)$$

This canonical density function may be, for example, a standard multivariate 
Gaussian. Also other density families (stable, hyperbolic) possess such simple 
canonical special cases with few parameters (basic distribution parameters), 
based on which the whole distribution class can be generated with the aid of 
a linear transformation. 








The nonsingular conditional matrix $A(x)$ and the conditional position
parameter vector $\mu(x)$ form a linear transformation $y = \mu(x) + A^{-1}(x) z$,
which generates the dependent random variables $Y_i = y_i$, $i = 1, \ldots, n$, with
the following relationship:

$$z = A(x)\,(y - \mu(x)). \qquad (3)$$

Applying the transformation theorem, we get the conditional PDF f of
the transformed random variable Y as

$$f(Y = y \mid x) = |A(x)|\, g(Z = z), \qquad (4)$$

with $|A(x)| \neq 0$ the determinant of $A(x)$. The conditional PDF f in (4)
represents the whole probability distribution class generated by the canonical
density $g(z)$ with the aid of (3).

According to (1) and (3), $\mu_\omega(x)$ and $A_\omega(x)$ are to be viewed as
parametric functions of the input vector x. By existence of basic distribution
parameters, we obtain a further parametric function $\nu_\omega(x)$. These mappings
can be modelled by an appropriate approximator.
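As a hedged illustration of (3) and (4), the following sketch evaluates the transformed density for a standard Gaussian canonical density and an upper triangular matrix A. The function names and the numerical values are assumptions made for the example only.

```python
import math


def std_normal_pdf(z):
    """Canonical density g: standard multivariate Gaussian."""
    n = len(z)
    return math.exp(-0.5 * sum(v * v for v in z)) / (2.0 * math.pi) ** (n / 2.0)


def transformed_pdf(y, mu, A, g=std_normal_pdf):
    """f(y | x) = |A| * g(A (y - mu)) for an upper triangular matrix A."""
    n = len(y)
    diff = [y[i] - mu[i] for i in range(n)]
    z = [sum(A[i][j] * diff[j] for j in range(n)) for i in range(n)]
    det = 1.0
    for i in range(n):
        det *= A[i][i]      # determinant of a triangular matrix
    return abs(det) * g(z)


# Hypothetical values of mu(x) and A(x) for one fixed input x.
mu = [1.0, -2.0]
A = [[0.5, 0.3],
     [0.0, 2.0]]
density = transformed_pdf([0.5, -1.5], mu, A)
```

For a diagonal A with entries 1/sigma_i the result reduces to a product of univariate normal densities, which gives a simple way to check the formula.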

3 Functional Approximation 













Fig. 1. The estimation model for the conditional distribution. Diagram labels:
maximum likelihood; estimator $f(x, \omega_{opt})$ with estimator parameters; intermediary
parameters $p_\mu$, $p_A$, $p_\nu$; transformation $\tau$; conditional distribution
parameters $\mu$, $A$, $\nu$.

Since functional approximators are real-valued mappings, but the
distribution parameters are subject to certain restrictions, a transformation $\tau$
of the outputs is required; $\tau$ is described in Section 3.2.

For this reason the concept shown in Figure 1 is used. The intermediary
parameters $p = (p_\mu, p_A, p_\nu)$ are the outputs of the mapping approximator.

3.1 Functional approximation with neural networks 

The real-valued mapping maps from the input variable space $\mathbb{R}^m$ to the
intermediary parameter space $\mathbb{R}^{\frac{1}{2}n(n+3)+u}$, where u denotes the number of
basic distribution parameters.

This mapping can be represented by a multi-layer perceptron whose
outputs are the intermediary parameters p. Suppose a perceptron with one hidden
layer of l neurons is used. Then the number of perceptron parameters to
be determined by an estimation procedure amounts to $ml + l\,\tfrac{1}{2}n(n+3) + u$.

3.2 Mapping real-valued vectors to distribution parameters

Following [8], suppose $A \in \mathbb{R}^{n \times n}$ is upper triangular with strictly positive
diagonal elements. This is not restrictive for the distribution classes treated
here.

Further, the matrix A is factorized into the form $A = UD$, where U
is upper triangular with $u_{ii} = 1$ and D is diagonal. It can be shown that
the strictly positive parameters $d_{ii}$ play the role of scaling parameters. This
representation has led to a significant improvement of the fit to the data in
comparison to the method of [8].

The components of the conditional position parameter vector $\mu(x)$ are
arbitrary real values, so the corresponding transformation is, trivially,
the identity transformation. To enforce the strict positivity of the diagonal
elements of D, and thus of A, the $d_{ii}$ are obtained with the help of an
exponential transformation. Therefore the transformation $\tau = (\tau_\mu, \tau_A, \tau_\nu)$, with

$$\mu_i = \tau_{\mu_i}(p_{\mu_i}) = p_{\mu_i}, \quad a_{ii} = d_{ii} = \tau_{a_{ii}}(p_{d_{ii}}) = e^{p_{d_{ii}}}, \quad i = 1, \ldots, n,$$

and

$$a_{ij} = d_{jj}\, u_{ij} = \tau_{a_{ij}}(p_{a_{ij}}), \quad j > i,$$

maps the components of the
intermediary parameters to the position parameter vector $\mu(x)$ and to the
components of the matrix $A = UD$. Because it depends on the chosen
distribution class, the transformation to the components of the basic
distribution parameter vector $\nu$ is not treated in detail.
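The mapping from unconstrained network outputs to admissible parameters can be sketched as follows. The function `build_parameters` and the split of the intermediary vector into `p_mu`, `p_d` and `p_u` are hypothetical names chosen for illustration, and the sketch assumes that the off-diagonal entries of U are passed through unchanged.

```python
import math


def build_parameters(p_mu, p_d, p_u):
    """Map real-valued outputs to mu and to A = U D with positive diagonal.

    p_mu: n values for the position vector (identity transformation)
    p_d:  n values, exponentiated to give the diagonal of D
    p_u:  n(n-1)/2 values, the strictly upper triangular part of U, row by row
    """
    n = len(p_mu)
    mu = list(p_mu)
    d = [math.exp(v) for v in p_d]          # strictly positive scales
    A = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        A[i][i] = d[i]                      # u_ii = 1, so a_ii = d_ii
        for j in range(i + 1, n):
            A[i][j] = p_u[k] * d[j]         # (U D)_ij = u_ij * d_jj
            k += 1
    return mu, A


# Hypothetical intermediary parameters for n = 2.
mu, A = build_parameters(p_mu=[0.3, -1.2], p_d=[0.0, math.log(2.0)], p_u=[0.5])
```

The three input vectors together contain n + n + n(n-1)/2 = n(n+3)/2 values, which matches the dimension of the intermediary parameter space apart from the basic distribution parameters.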

4 Parameter Estimation 

4.1 Maximum likelihood principle 

The fitting of the approximator is done via the maximum likelihood principle
by minimizing the cross entropy, with the help of a data set containing K pairs
$(x_k, y_k)$, which can be viewed as columns of matrices X and Y, i.e. $X \in \mathbb{R}^{m \times K}$
and $Y \in \mathbb{R}^{n \times K}$.

Independent data samples lead to the negative log-likelihood

$$L(X, Y) = -\sum_{k=1}^{K} \log f(y_k \mid x_k). \qquad (5)$$

Substituting the conditional PDF (4) into (5), we obtain the
parameterized negative log-likelihood

$$L(X, Y, \omega) = -\sum_{k=1}^{K} \log |A_\omega(x_k)| - \sum_{k=1}^{K} \log g_{\nu_\omega}\big(A_\omega(x_k)\,(y_k - \mu_\omega(x_k))\big). \qquad (6)$$
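For the scalar case (n = 1) with a standard Gaussian canonical density, the negative log-likelihood (6) can be sketched as below. The parameter functions `mu_fn` and `a_fn` stand in for the network outputs and are hypothetical; the data are invented.

```python
import math


def gaussian_log_g(z):
    """Log of the standard normal canonical density."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def negative_log_likelihood(xs, ys, mu_fn, a_fn, log_g=gaussian_log_g):
    """-sum log|A(x_k)| - sum log g(A(x_k) (y_k - mu(x_k))), scalar case."""
    total = 0.0
    for x, y in zip(xs, ys):
        a = a_fn(x)
        total -= math.log(abs(a)) + log_g(a * (y - mu_fn(x)))
    return total


# Hypothetical data, roughly following y = 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 1.9, 4.2, 5.8]

nll_good = negative_log_likelihood(xs, ys, lambda x: 2.0 * x, lambda x: 5.0)
nll_poor = negative_log_likelihood(xs, ys, lambda x: 3.0, lambda x: 5.0)
```

A parameter function that follows the data gives the smaller value, which is what the minimization over the network weights exploits.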








4.2 Estimation procedure 

In Equation (6), the log-likelihood function is expressed with the help of the
parameter functions $\mu_\omega(x)$, $A_\omega(x)$ and $\nu_\omega(x)$. The task of functional
approximation is solved by numerical optimization methods.

Minimizing (6) over the parameter set $\omega \in \Omega$ maximizes
the likelihood. As the estimation procedure, the modified global optimization
method "Multi-Level Single-Linkage" has been used. A local method is part
of the global procedure (e.g. a conjugate gradient method).

The gradient formula used in the local search follows the chain rule

$$\frac{dL(X, Y, \omega)}{d\omega} = \sum_{k=1}^{K} \frac{\partial L(x_k, y_k, \omega)}{\partial p(x_k, \omega)}\, \frac{\partial p(x_k, \omega)}{\partial \omega}.$$

The first term in the summation is computed by taking the corresponding
derivatives of the formulas of Section 2. The second term is obtained by the
usual backpropagation formulas [5].



5 Probability Distribution Classes 



The developed concept can be applied to a wide variety of probability
distributions. The well-known Gaussian distribution is one special case of the
canonical PDFs of the general concept above.

Likewise, the multivariate Student, stable or hyperbolic distributions
can be used with this concept. To generate the family of generalized
hyperbolic probability distributions, we need to use the spherical form of the PDF

$$g(Z = z) = a\, \frac{K_{\lambda - n/2}\left(\alpha \sqrt{1 + z^\top z}\right)}{\left(\alpha^{-1} \sqrt{1 + z^\top z}\right)^{n/2 - \lambda}}\, e^{\beta^\top z}$$

with $a = \left(\sqrt{\alpha^2 - \beta^\top \beta}\right)^{\lambda} / \left((2\pi)^{n/2} K_\lambda\left(\sqrt{\alpha^2 - \beta^\top \beta}\right)\right)$, where $K_\nu$ denotes the
modified Bessel function of the third kind with index $\nu$. The parameters have the
following domain of variation: $\lambda, \alpha \in \mathbb{R}$, $\beta \in \mathbb{R}^n$ with $\alpha^2 > \beta^\top \beta$.

The parameters $\alpha$, $\beta$ and $\lambda$ are basic distribution parameters.



6 Application: Truck Sales Forecast 

The distribution forecast concept has been applied to the demand forecast of
trucks. A forecast model for the sales of the next month has been identified.
The results shown were obtained under the assumption of different
underlying classes of distributions and different information levels. Explaining
variables x were, for example, the demand over the past six months (pd),
seasonality (s) and the existing orders (eo) for the upcoming three months.
Table 1 shows the results for a homoscedastic model (with only a conditional
mean) as well as heteroscedastic (i.e., with conditional scaling parameter)
models with several distribution types and various information levels. The
forecasting error is reduced by 45 percent by using the available information.
A further reduction of 21 percent is achieved by using a conditional variance, of
76 percent by using a stable distribution instead of the Gaussian, and of 84
percent by using a hyperbolic distribution.



Table 1. Forecast improvement due to information, conditionality and
non-Gaussian distributions

Distribution                               Information     Error
homoscedastic Gaussian                     pd              0.295
homoscedastic Gaussian                     pd + s          0.284
homoscedastic Gaussian                     pd + s + eo     0.161
heteroscedastic Gaussian                   pd + s + eo     0.127
heteroscedastic stable distribution        pd + s + eo     0.039
heteroscedastic hyperbolic distribution    pd + s + eo     0.025
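The percentages quoted in the text can be recomputed from Table 1. The sketch below assumes that all three "further" improvements are measured against the homoscedastic Gaussian model with full information (error 0.161), since this is the baseline that reproduces the quoted figures.

```python
def improvement(before, after):
    """Relative error reduction in percent."""
    return 100.0 * (before - after) / before


# Errors from Table 1.
homo_pd = 0.295          # homoscedastic Gaussian, pd only
homo_full = 0.161        # homoscedastic Gaussian, pd + s + eo
hetero_gauss = 0.127     # heteroscedastic Gaussian
hetero_stable = 0.039    # heteroscedastic stable distribution
hetero_hyper = 0.025     # heteroscedastic hyperbolic distribution

by_information = improvement(homo_pd, homo_full)        # about 45.4
by_variance = improvement(homo_full, hetero_gauss)      # about 21.1
by_stable = improvement(homo_full, hetero_stable)       # about 75.8
by_hyperbolic = improvement(homo_full, hetero_hyper)    # about 84.5
```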



References 

1. Ernst Eberlein and Karsten Prause. The generalized hyperbolic model: financial
derivatives and risk measures. Mathematical Finance, 2000.

2. A.H.G. Rinnooy Kan and G.T. Timmer. Stochastic global optimization
methods, part I: Clustering methods, part II: Multi-level methods. Mathematical
Programming, 39(1):26-78, 1987.

3. Svetlozar Rachev and Stefan Mittnik. Stable Paretian Models in Finance. Wiley
& Sons, Inc., 2000.

4. D.E. Rumelhart, G.E. Hinton, and R.J. Williams. Learning internal
representations by error propagation. In D.E. Rumelhart and J.L. McClelland, editors,
Parallel Distributed Processing, volume 1, chapter 8. MIT Press, Cambridge MA,
1986.

5. D.E. Rumelhart and J.L. McClelland, editors. Parallel Distributed Processing,
volume 1. MIT Press, Cambridge, 1986.

6. Alan Stuart, Keith Ord, and Steven Arnold. Kendall's Advanced Theory of
Statistics, Volume II A: Classical Inference and the Linear Model. Arnold,
London, 1999.

7. D. M. Titterington, A. F. M. Smith, and U. E. Makov. Statistical Analysis of
Finite Mixture Distributions. Chichester: John Wiley, 1985.

8. Peter M. Williams. Using neural networks to model conditional multivariate
densities. Neural Computation, 8:843-854, 1996.




Application of Techniques of Functional Data 
Analysis to Spectroscopic Data 



Vera Hofer 

Karl-Franzens University, Graz, Austria



Abstract. A classification and identification problem is solved in the case of func- 
tional data. Functional data are data records considered as functions rather than 
vectors of observations at different values of the argument. This research deals with 
the spectral lines of rocks, being treated with light in the visible and mid-infrared 
regions. In a first step Principal Component Analysis in the functional data case 
will be applied to spectral lines of rock samples of various rock types to find out, 
whether the spectral lines can be separated. 



1 Introduction 

Functional Data Analysis stands for statistical techniques that are applied
when the features observed have a characteristic structure [1,2]: we do not
measure a scalar but a function. In this study ten rock samples of each of six rock
types were selected and treated with light of various wavelengths in the
visible and mid-infrared regions. The reflectivity of these samples was measured
for these wavelengths in some proper unit. Overlaying all curves does
not suggest that separation is possible. We cannot find any structure, except that
the lines of group six have a completely different shape from the others.
Looking at the mean functions suggests that separation of the groups could
probably be successful. Now the task is to identify each class by its typical
spectral lines, so that the spectral lines tell us which class a
sample of unknown class membership belongs to. This problem is of great
interest because mineral aggregates are the most commonly used construction
materials worldwide. Improving quality control would reduce not only
production costs but also costs occurring as a result of improper exploitation
of the material or of environmental impacts of the production process.

Fig. 1. Spectral lines of ten samples of six rock types each - visible region - (left)
and the mean functions for the six groups (right).

Why is Functional Data Analysis applied in this study? One might suggest
using techniques of Multivariate Analysis. Then each wavelength corresponds
to a treatment. But one has to bear in mind that there are many
more treatments (predictors) than observations. Furthermore, the variables
are highly correlated, so that the matrices occurring in Multivariate Analysis
are badly conditioned [5].

2 First Analysis 

If we look at our functions, we see that four steps have to be carried out first:
bundling, smoothing, registration, Principal Component Analysis. We have
observed

$$y(t) = g(t) + \varepsilon. \qquad (1)$$

Here t stands for the wavelength. Our functions $g(t)$, i.e. the spectral lines, are
arbitrary, which means that their shape is determined mainly by the class
membership. But there are also other factors affecting the shape of the lines,
such as elements that can be included or not (e.g. Fe, Si, hydroxyls), depending
on the growth conditions in the mines the samples were taken from. This
fact calls for a careful choice of mines to guarantee representative samples.

Variability is also caused by the measurement applied. This error $\varepsilon$
(roughness of the curves) is called white noise. As this white noise affects
the correlation to a high extent, we should try to eliminate it by smoothing.
Besides, we are only interested in the characteristic shape of the curves $g(t)$
[4]. But before proceeding in this way, we have to eliminate a parallel effect:
the lines are parallel, which can be shown by a scatter plot (Fig. 2). After
bundling this effect has disappeared. Bundling means that the average distance
of each curve to either the class mean function or the overall mean
function is calculated. This is done in a region where the lines are parallel.
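A minimal sketch of bundling follows. It assumes that, once the average distance to the reference function has been calculated, each curve is shifted by that distance; the text only states that the distance is calculated, so the shift is an interpretation. Function names and data are hypothetical.

```python
def mean_function(curves):
    """Pointwise mean of curves sampled on a common grid."""
    n = len(curves)
    return [sum(c[k] for c in curves) / n for k in range(len(curves[0]))]


def bundle(curves, reference, region):
    """Shift each curve by its average distance to the reference in a region.

    region: grid indices where the curves run parallel to the reference.
    """
    bundled = []
    for curve in curves:
        offset = sum(curve[k] - reference[k] for k in region) / len(region)
        bundled.append([v - offset for v in curve])
    return bundled


# Hypothetical parallel curves: the same shape at three different levels.
base = [1.0, 2.0, 4.0, 3.0]
curves = [[v + shift for v in base] for shift in (0.0, 1.0, 2.0)]

reference = mean_function(curves)       # overall (or class) mean function
bundled = bundle(curves, reference, region=range(len(base)))
```

For exactly parallel curves the bundled curves coincide with the reference, which is the disappearance of the parallel effect described above.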

The next step is the elimination of white noise. There are different ways
of smoothing. A moving average has two disadvantages: firstly, we lose
observation points, and secondly, we get a discretized function, so that we have
to interpolate afterwards to be able to evaluate the function at any point or
compute derivatives. This is why smoothing is carried out by regression
analysis. Polynomial regression is strongly affected by extreme observations. So
we use spline interpolation, which means that the regression function is
piecewise a cubic polynomial. If we do not restrict the regression function, we get
the interpolation function for all points observed and the squared error is
zero. Now we can calculate the first two derivatives, but the white noise has
not disappeared. That means that we could not extract the characteristics
of the curves observed. Therefore we add a penalty to the sum of squares to
penalize the curvature [3]:

$$\sum_{i=1}^{n} \{y_i - g(t_i)\}^2 + \alpha \int_a^b \{g''(x)\}^2\, dx, \qquad a < t_1 < t_2 < \cdots < t_n < b \qquad (2)$$
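The trade-off in criterion (2) can be illustrated with a discrete analogue, in which second differences on an equally spaced grid replace the second derivative. This is a simplification chosen for the example, not the spline-based computation of the paper, and the data values are invented.

```python
def penalized_criterion(y, g, alpha, h):
    """Discrete version of criterion (2) on an equally spaced grid of width h."""
    residual = sum((yi - gi) ** 2 for yi, gi in zip(y, g))
    # Second differences approximate g''; multiplying the sum of their
    # squares by h approximates the integral of the squared curvature.
    curvature = sum(((g[i - 1] - 2.0 * g[i] + g[i + 1]) / h ** 2) ** 2
                    for i in range(1, len(g) - 1)) * h
    return residual + alpha * curvature


# Hypothetical noisy observations around a straight line.
y = [0.0, 1.2, 1.8, 3.1, 3.9]
interpolant = list(y)                      # reproduces every observation
straight = [0.0, 1.0, 2.0, 3.0, 4.0]       # smooth candidate, zero curvature

without_penalty = (penalized_criterion(y, interpolant, 0.0, 1.0),
                   penalized_criterion(y, straight, 0.0, 1.0))
with_penalty = (penalized_criterion(y, interpolant, 1.0, 1.0),
                penalized_criterion(y, straight, 1.0, 1.0))
```

Without the penalty the interpolant wins with zero squared error; with the penalty the smooth candidate has the smaller criterion value.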

To avoid huge systems of linear equations and to be able to use different
penalties, as well as to simplify further analysis such as Principal Component
Analysis, we want the regression function to be a basis expansion. The basis
functions $\beta_j(t)$ are splines with compact support.

$$g(t) = \sum_{j=1}^{q} c_j\, \beta_j(t) \qquad (3)$$

Five knots within the interval observed are used for the creation of each basis
function. So we have to estimate only the q parameters $c_j$. The number of basis
functions will depend on the oscillation of the curves. Choosing too small a
number of basis functions for highly oscillating curves results in a smooth
curve that has fewer characteristics (maxima, minima) than the original curve.








Fig. 4. Basis expansion of the spectral line of the first sample of the first group 
with 100 basis functions (left) and with 10 basis functions (right) - visible region. 



After smoothing, the curves could be registered. Registration is a
transformation of the abscissa to align the characteristic features of the curves (e.g.
maxima, minima). In this study the variability in phase is as important as
the variability in amplitude. So registration was not carried out in the first
approach.

3 Principal Component Analysis

Having estimated our curves for the various samples, we apply Principal
Component Analysis. Similar to the multivariate case, we look for the
direction of the greatest variability:

$$\max_{\beta}\ \frac{1}{n} \sum_{i=1}^{n} f_i^2 \quad \text{subject to} \quad \|\beta\|^2 = 1 \qquad (4)$$

In the functional case [2] the scores are

$$f_i = \langle \beta, x_i \rangle = \int \beta(s)\, x_i(s)\, ds. \qquad (5)$$
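The score in (5) is an inner product of two functions and can be approximated numerically. The sketch below uses the trapezoidal rule on a common grid; the weight function and the curve are hypothetical test functions, not estimated principal components.

```python
def trapezoid(values, grid):
    """Trapezoidal approximation of the integral of sampled values."""
    return sum(0.5 * (values[k] + values[k + 1]) * (grid[k + 1] - grid[k])
               for k in range(len(grid) - 1))


def score(beta, x, grid):
    """f_i = integral of beta(s) * x_i(s) ds for functions sampled on grid."""
    return trapezoid([b * v for b, v in zip(beta, x)], grid)


grid = [k / 100.0 for k in range(101)]      # interval [0, 1]
beta = [1.0 for _ in grid]                  # weight function with unit norm
curve = [s for s in grid]                   # hypothetical curve x_i(s) = s

norm_squared = score(beta, beta, grid)      # squared norm of beta
f_i = score(beta, curve, grid)              # score of the curve
```

The constraint in (4) corresponds to `norm_squared` being equal to one; here the score of the curve x(s) = s is its integral over the unit interval.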

For the mid-infrared measurements we do not get clusters by plotting the
scores of the first principal component against the scores of the second one.
After bundling as described before, the first, second, third and sixth groups
are clustered, whereas we cannot separate the fourth and fifth groups, although
they can be distinguished from the others. The use of more basis functions
does not improve separation. Principal Component Analysis in the visible
region shows that there is no difference in separation whether we bundle around
the common mean function or use the data after smoothing but without further
manipulation. By applying Principal Component Analysis to the group-bundled
data it is also possible to separate group four and group five.

Fig. 5. Principal Component Analysis in the mid-infrared region without bundling
(ten basis functions) (left) and after bundling (right).
















Fig. 6. Principal Component Analysis of the spectral lines in the visible region with- 
out bundling (left) and after bundling around the group mean functions (right). 
In both cases 100 basis functions were used. 



From the Principal Component Analysis above we see the proportion
of variability that is explained by the first principal component. This is
93%; together with the second principal component, 98% of the variability
is explained. In Functional Data Analysis the principal components are
functions. The first principal component is low at the beginning and at the end
and higher in the middle. So, samples with high reflectivity at the beginning
and in the middle of the interval will have high first scores. That means that
the first principal component describes variability relative to the common
mean function, whereas the second principal component indicates the
variability of the function values throughout the interval. Those lines that
form a broad image will have high second scores [2].

Fig. 7. Proportion of variance explained by each eigenfunction (left), the five
principal components (right).

4 Conclusion 

Separation was possible by using bundling and smoothing. Further steps are
to optimize the interval where the bundling technique is carried out and
to implement a cross-validation. When assigning a curve of unknown class
membership to the group it belongs to, one could proceed stepwise: if the
curve cannot be assigned to group 1 or group 6, the bundling technique has
to be applied. Besides, the various pieces of information on the samples
could successfully be combined by a Canonical Correlation Analysis. In any case, a
Discriminant Analysis should make a quick separation possible.

References 

1. Ramsay J. O., Silverman B. W. (2002) Applied Functional Data Analysis. Methods
and Case Studies. Springer, Berlin Heidelberg

2. Ramsay J. O., Silverman B. W. (1997) Functional Data Analysis. Springer,
Berlin Heidelberg

3. Green P. J., Silverman B. W. (1994) Nonparametric Regression and Generalized
Linear Models. Monographs on Statistics and Applied Probability 58. Chapman
and Hall

4. James G. M., Hastie T. J. (2001) Functional Linear Discriminant Analysis for
Irregularly Sampled Curves. Internet

5. Hastie T. J., Buja A., Tibshirani R. (1995) Penalized Discriminant Analysis.
Annals of Statistics, 23, 73-102






Integrating Exchange Rate Theory in Data Mining 



Bernd Brandl

Department of Government, University of Vienna, Brünnerstraße 72, A-1210
Vienna (bernd.brandl@univie.ac.at)



Abstract. This paper focuses on integrating exchange rate theory into a data
mining process for the purpose of forecasting. The approach is centred on
a Genetic Algorithm (GA) and Artificial Neural Networks (ANNs), which allow the
identification of relationships that are not describable by economic theory.
Experience shows that relationships derived by data mining are often unconvincing
as regards their correctness and effectiveness, and most data mining approaches
do little to persuade otherwise. This work tries to remove part of this limitation
by combining economic theory with data mining. Usually, the role of economic
theory in exchange rate forecasting is to identify a list of relevant variables
to be included in the analysis, possibly with plausible signs of their coefficients.
Previous research documents the failure of this way of analysis, and thus of
exchange rate theory, in forecasting exchange rates at frequencies up to one
month. Consistent with these results, this paper extends the role of economic
theory: it is implemented in data mining as a framework within which the
possibilities of data mining are exploited. Other findings include: (i) during the
years 2000 and 2001 countries’ relative economic growth is most significant in
forecasting exchange rates one period ahead; (ii) the financial market is of major
concern when explaining exchange rate movements, as it may be used to proxy real
economic activity and as it maps massive capital flows between countries, which
in turn affect exchange rates; (iii) fundamental forecasting is more effective at
lower frequencies. The approach is illustrated in some detail for five exchange
rates at daily, weekly and monthly frequencies.



Data Mining 

By definition, data mining searches for unknown and interesting relationships.
This is the standard way data mining is viewed in the literature; see for example
Hand et al. (2001). For the purpose of this work, however, data mining is defined
as the semi-automatic search for relationships among many time series which are
possibly important to explain future exchange rate fluctuations. For a similar
definition of data mining see for example Chereb (1998). Yet, as recent years
have shown, in economics and finance the possibilities of data mining are viewed
quite skeptically, partly because the results of data mining are relatively often
not startling. Furthermore, economic theory offers many (profound) relationships
and, of course, if relationships are already known it makes no sense to search
for them again. While data mining may, by definition, find undetected
relationships, economic literature often tends to interpret them as temporary or
too case specific. In general, the relationships uncovered by data mining are seen
as less apparent, and they are expected to be more case specific and valid only
temporarily. The problem in the context of exchange rate forecasts, however, is
that forecast models based on economic theory generally perform weakly in
predicting exchange rate movements out-of-sample at frequencies (at least) up to
one month, even though relationships described by economic theory are often found
to hold in the long run and thus can be interpreted as a kind of long-run
gravitation center. This pessimistic view of the possibilities of exchange rate
theory has been the mainstream notion in economic literature since the seminal
work of Meese and Rogoff (1983).

Dangers of data mining 

Even though the critique of the relationships found by data mining has to be
taken seriously, other work has shown that different data mining techniques and
approaches can indeed be very effective in many fields of application. Our own
experience in building forecast models on the basis of fundamentals showed that
data mining is also very effective in providing forecast models with a remarkably
high goodness of fit in out-of-sample evaluations. Yet, it has to be stressed that
a high goodness of fit was sometimes valid only temporarily (over specific
evaluation periods) and may also be the result of data snooping, overfitting and
spurious relationships. However, the latter problems are inherent in statistical
forecasting and are almost impossible to solve, see for example White (2000), and
are therefore not specific to data mining forecast models. Data mining thus
remains a very useful tool of analysis. One limitation of data mining, however, is
that it uses only information that is contained in the data set. From a
theoretical perspective this limitation is unproblematic. In practice, however,
the data sets that are usually available to exploit unknown relationships are
often problematic since they are too small or not sufficiently clean. This
limitation is a crucial point for the combination of data mining with economic
theory, as the latter is able to add information that lies outside of too small
data sets.



Using exchange rate theory 

In this work the forecast equations of the Purchasing Power Parity (PPP), the
Uncovered Interest Parity (UIP), the general Monetary Model, the Dornbusch
(1976) Overshooting Model and the Frankel (1979) Real Interest Differential
Model have been used to provide a framework in which data mining not only
searches among various time series to find those that best express the formulated
aggregates of explanatory variables, but also searches for further relationships
to explain one-period-ahead exchange rate movements. These five theoretical
approaches have been selected because the literature promises them to be most
effective, or because they are central in the discipline of international
economics.^1 However, applying economic theory constrains the forecast models to
behave according to the described laws. For example, PPP says that when the price
level in country A is increasing, with other things held equal, the currency of
country A is expected to depreciate. This translates into forcing the model to
yield a negative coefficient. The data set, however, and data mining results based
solely on statistical criteria (measured on the basis of limited evaluation
periods) will permit a coefficient different from that of economic theory,
depending on the mix of series used in the forecast equation found. Within the
applied GA the correct coefficient was taken into consideration, making sure that
economic theory is used adequately. Coefficients other than those of economic
theory would violate economic laws and would therefore be more likely to lead to
implausible results. As mentioned above, the forecast equations provided by
economic theory are extended by adding time series selected by data mining.
Because of the dangers of data mining, the applied data mining approach is based
on various steps of analysis which all try to filter out unimportant and spurious
relationships. The applied approach therefore consists of a combination of
traditional data mining tools such as cluster analysis and (rolling) correlation
analysis and “newer” tools such as the core of the data mining approach, the
constrained GA. By applying different methods and techniques of analysis it was
possible to select time series out of the vast universe of possibly relevant time
series and thus to reduce the search space in which combinations of variables are
checked for their effectiveness in explaining future exchange rate movements.
After the universe of possibly interesting time series had been reduced to a pool
of time series which have a minimum correlation with the respective exchange
rate, the following (constrained) GA was used.
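The pre-selection step described above, keeping only series with a minimum correlation to the exchange rate, could look roughly as follows. This is a sketch under stated assumptions rather than the procedure actually used: the window length, the threshold, the use of the mean absolute rolling correlation as the selection criterion, and all function and series names are hypothetical.

```python
import numpy as np

def rolling_correlation(x, y, window):
    """Pearson correlation of x and y over each trailing window."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = []
    for end in range(window, len(x) + 1):
        xs, ys = x[end - window:end], y[end - window:end]
        out.append(np.corrcoef(xs, ys)[0, 1])
    return np.array(out)

def prefilter(candidates, target, window=24, min_abs_corr=0.5):
    """Keep candidates whose mean absolute rolling correlation with the
    target reaches the (assumed) threshold."""
    kept = {}
    for name, series in candidates.items():
        corr = rolling_correlation(series, target, window)
        strength = float(np.mean(np.abs(corr)))
        if strength >= min_abs_corr:
            kept[name] = strength
    return kept

# Synthetic stand-ins for exchange rate changes and candidate series.
rng = np.random.default_rng(1)
target = rng.normal(size=120)
candidates = {
    "related_series": target + 0.5 * rng.normal(size=120),
    "unrelated_series": rng.normal(size=120),
}
pool = prefilter(candidates, target, window=24, min_abs_corr=0.5)
```

Only the series that survive this filter would enter the pool searched by the GA, which keeps the number of combinations to be evaluated manageable.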



Designing the GA 

The applied GA was designed to build exchange rate forecast models that prove
to be successful in out-of-sample tests, which means that they have to provide a
forecasting performance better than the naive forecast. To do so, the GA was
constrained in such a way that the forecast models constructed meet the demands of
the forecast equations provided by economic theory, but are allowed to include
other sorts of influence, that is, series in addition to those suggested by
economic theory. The idea is to build forecast models which, first, are based on
profound economic relationships, provided by economic theory, that behave
relatively stably over a long time but may temporarily be covered by other sources
of influence and, second, consider temporary and/or case specific sorts of
influence which can indeed become important in some sub-periods. Therefore a
penalty term is included in the GA fitness function which ensures that the
performance of the models does not differ significantly over different
out-of-sample evaluation sub-periods. For reasons of space a more detailed
discussion of the fitness function is omitted. For the exploitation of temporary
and/or case specific influences



^1 For recent literature on the applied theoretical approaches see for example Dutt and
Ghosh (2000), Groen (2000) and Rogoff (2002). For a description of the variables used
in the forecast equations see p. 5.








and relationships, automated data mining appears to be well suited since it allows
(but also demands) a continuous check on the validity and correctness of
relationships. The combination of theoretical relationships with additional
relationships has the advantage that the goodness of fit of the theoretical models
can be raised without the risk that profound relationships offered by economic
theory get lost. To search among combinations of those two sorts of influence, a
GA was applied in this work. To constrain the GA, the search space is divided into
several clusters or groups from each of which the GA has to select variables to
build forecast models, usually one series from each cluster, while the clusters
are assembled according to theoretical considerations. This means for example that
if economic theory demands a consideration of interest rates in a forecast
equation, a cluster is used that consists of a variety of bonds with varying
maturity and other series expressing a country’s interest rate, such as prime
lending rates or interbank rates, from which the GA tries to find the most
appropriate series. A cluster representing a country’s real economic activity
usually contains series of industrial production, leading indicators such as
housing starts, car sales and registrations, composite indicators and other series
that express economic growth. The advantage of using a GA for this optimization
task is that many combinations of series from different clusters can be evaluated
without losing the structure of the theoretical models. The number of clusters
depends on the theory used, and the number of series in a cluster depends on how
abstractly the theory is formulated, as well as on issues such as the availability
of data at different frequencies. Especially at higher frequencies many theories
were only applicable by using proxies. As mentioned, in addition to the series in
accordance with economic theory, other sources of influence are evaluated. Usually
other exchange rates and financial market series such as stock market indices have
been taken into account, because they are not only said to have the capacity to
proxy (or even anticipate) real economic activity but also map capital movements
between countries, which in turn affect exchange rates (S). This idea of
constraining the GA can be expressed analytically. The general representation of
the forecast equation to be optimized is of the following form:

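$$S_t = \sum_{j=0}^{k_1}\sum_{i=1}^{n_1}\beta^{1}_{ij}F^{1}_{i,t-j} + \sum_{j=0}^{k_2}\sum_{i=1}^{n_2}\beta^{2}_{ij}F^{2}_{i,t-j} + \dots + \sum_{j=0}^{k_m}\sum_{i=1}^{n_m}\beta^{m}_{ij}F^{m}_{i,t-j} + \sum_{j=0}^{k_X}\sum_{i=1}^{n_X}\alpha_{ij}X_{i,t-j}$$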

Here F indicates series from groups according to economic theory and X indicates
all other series used. Accordingly, β denotes the coefficients of the theoretical
variables and α those of all others; m stands for the number of theoretical
clusters, n for the number of variables in one cluster and k for the number of
lagged series considered. The GA is used for the model selection, whereas it is
constrained to select exactly one variable from every theoretical cluster, which
means

$\forall m \; \exists!\, i, j : \beta^{m}_{ij} \neq 0$. The number of series that can be selected from cluster X is
unconstrained. As mentioned, the number of clusters depends on which theory is
used. For PPP this means that $F^1 = \Delta P$ and $F^2 = \Delta P^*$, where P denotes time series
which express changes in the domestic price level and P* those in the foreign
price level. For UIP there is a single cluster $F^1$ of time series which consider
the interest rate differential between the two countries. For the Monetary Model
$F^1 = M$, $F^2 = M^*$, $F^3 = Y$, $F^4 = Y^*$, $F^5 = r$ and $F^6 = r^*$. M denotes time series for
the domestic money supply (M* for the foreign money supply), Y are time series
which express domestic real economic activity (Y* for foreign real economic
activity) and r, respectively r*, denotes all time series which express the rate
of interest in the home country, respectively the foreign country. For the
Dornbusch Model the clusters are $P_{ratio}$ and the interest rate differential.
Series for $P_{ratio}$ all express the ratio between the domestic and foreign price
level. Last, for the Frankel Model $F^1 = M$, $F^2 = M^*$, $F^3 = Y$, $F^4 = Y^*$,
$F^5 = r_{real}$, $F^6 = r^*_{real}$, $F^7 = r_{short}$, $F^8 = r^*_{short}$, $F^9 = r_{long}$ and
$F^{10} = r^*_{long}$. In the Frankel approach the interest rate factor is thus divided
into three clusters to consider the real rate of interest (subscript real), short
term interest rates (subscript short) and long term interest rates (subscript
long) separately.
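The constraint that exactly one series is chosen from every theoretical cluster, while the selection from X is free, can be built into the representation of a candidate model so that the genetic operators can never violate it. The sketch below illustrates this idea only; the encoding, the operators, the cluster sizes and the form of the penalty on unstable sub-period performance are assumptions, since the paper does not spell them out.

```python
import random

def random_individual(cluster_sizes, n_extra, rng):
    """One series index per theoretical cluster, any subset of the extra pool X."""
    theory = [rng.randrange(size) for size in cluster_sizes]
    extra = {i for i in range(n_extra) if rng.random() < 0.5}
    return {"theory": theory, "extra": extra}

def mutate(individual, cluster_sizes, n_extra, rng, rate=0.2):
    """Mutation that keeps exactly one gene per theoretical cluster."""
    theory = [
        rng.randrange(size) if rng.random() < rate else gene
        for gene, size in zip(individual["theory"], cluster_sizes)
    ]
    extra = set(individual["extra"])
    for i in range(n_extra):
        if rng.random() < rate:
            extra ^= {i}                     # flip membership of extra series i
    return {"theory": theory, "extra": extra}

def crossover(a, b, rng):
    """Uniform crossover, cluster by cluster; shared extra series are kept."""
    theory = [rng.choice(pair) for pair in zip(a["theory"], b["theory"])]
    shared = a["extra"] & b["extra"]
    contested = (a["extra"] | b["extra"]) - shared
    extra = shared | {i for i in sorted(contested) if rng.random() < 0.5}
    return {"theory": theory, "extra": extra}

def is_feasible(individual, cluster_sizes):
    """True if every theoretical cluster contributes exactly one valid index."""
    genes = individual["theory"]
    return len(genes) == len(cluster_sizes) and all(
        0 <= g < s for g, s in zip(genes, cluster_sizes)
    )

def penalized_fitness(subperiod_mse, penalty_weight=1.0):
    """Lower is better: mean error plus a penalty on its spread across
    out-of-sample sub-periods (assumed form of the penalty term)."""
    mean = sum(subperiod_mse) / len(subperiod_mse)
    spread = (sum((e - mean) ** 2 for e in subperiod_mse) / len(subperiod_mse)) ** 0.5
    return mean + penalty_weight * spread

# Hypothetical sizes for the six clusters of the Monetary Model and for X.
cluster_sizes = [4, 4, 3, 3, 5, 5]
n_extra = 10
rng = random.Random(42)
parent_a = random_individual(cluster_sizes, n_extra, rng)
parent_b = random_individual(cluster_sizes, n_extra, rng)
child = mutate(crossover(parent_a, parent_b, rng), cluster_sizes, n_extra, rng)
```

Encoding the constraint in the chromosome, rather than rejecting infeasible candidates after the fact, means that every model the GA evaluates retains the structure of the underlying theoretical forecast equation.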



Summary of the Results 

Applying the GA within the theoretical constraints not only resulted in
considerably well performing forecast models, but also revealed some interesting
features. The performance was evaluated using linear methods (OLS regression) and
non-linear methods (ANNs). Firstly, the out-of-sample performance achieved at a
monthly frequency was significantly superior to that at the daily and weekly
frequencies. The reasons for this are manifold and can be found in the richer
availability of data at lower frequencies (at daily and weekly frequencies often
only proxies were available), but also in the fact that, except for the UIP, the
theories integrated do not focus on a daily or weekly perspective, but on much
longer horizons. This led to the result that in general the naive forecast could
be outperformed significantly only for the exchange rates at a monthly frequency.
Regarding the applicability of the five theoretical approaches, the general
Monetary Model in combination with other sources of influence achieved the best
performance. The forecast equation of the Monetary Model was outperformed in only
one case, namely the Euro/US dollar at a monthly frequency, by the PPP approach as
theoretical core. Secondly, it is worth noting that during the evaluation periods
(the years 2000 and 2001) variables expressing relative economic growth between
the two countries in the bilateral exchange rate relationship were selected very
often by the GA and provide much explanatory power. This result is especially
interesting as it was achieved by machine learning but also reflects the general
view in economic literature of the main source of influence on exchange rate
movements (during that evaluation period). Thirdly, it was discovered that aside
from the inclusion of variables expressing real economic activity, data mining
also often resulted in the inclusion of stock market indices. In the context of
the importance of variables expressing economic growth, it is interesting that the
presence of stock market indices can be explained by the fact that the stock
market is said to exhibit predictive power for real economic activity. Overall,
the application of the constrained GA increased the out-of-sample performance of
the theoretical approaches to exchange rate determination substantially. In
conclusion, the combined approach in general resulted in a better out-of-sample
performance than either approach applied separately. By combining the two
approaches (data mining and economic theory) it was possible to construct forecast
models at a monthly frequency which behave stably over time and have a high
goodness of fit.
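Whether a model outperforms the naive forecast can be checked by relating its out-of-sample error to that of the naive forecast. The sketch below assumes that the naive forecast is the no-change (random walk) forecast and uses the ratio of mean squared errors as the criterion; neither choice is stated in the paper, and the numbers are invented for illustration.

```python
import numpy as np

def naive_forecast(rates):
    """No-change forecast: the next rate is predicted to equal the current one."""
    rates = np.asarray(rates, dtype=float)
    return rates[:-1]

def mse_ratio(model_forecast, rates):
    """Model MSE divided by naive MSE; values below 1 beat the naive forecast."""
    rates = np.asarray(rates, dtype=float)
    actual = rates[1:]
    model_mse = np.mean((np.asarray(model_forecast, dtype=float) - actual) ** 2)
    naive_mse = np.mean((naive_forecast(rates) - actual) ** 2)
    return model_mse / naive_mse

rates = np.array([1.00, 1.02, 1.01, 1.04, 1.03, 1.06])   # invented rate path
perfect = rates[1:]            # a forecast that is always right
same_as_naive = rates[:-1]     # a forecast identical to the naive one

ratio_perfect = mse_ratio(perfect, rates)
ratio_naive = mse_ratio(same_as_naive, rates)
```

A forecast model would be judged successful in the sense used here only if its ratio stays below one over the out-of-sample evaluation period.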



References 

Chereb, D., 1998, Does data mining improve business forecasting? Presentation
at the 18th International Symposium on Forecasting, Edinburgh, Scotland
Dornbusch, R., 1976, Expectations and exchange rate dynamics. Journal of
Political Economy 84:6, 1161-1176
Dutt, S.D., Ghosh, D., 2000, An empirical note on the monetary exchange rate
model. Applied Economics Letters 7, 669-671
Frankel, J.A., 1979, On the mark: A theory of floating exchange rates based on
real interest differentials. American Economic Review 69:4, 610-622
Groen, J.J.J., 2000, The monetary exchange rate model as a long-run
phenomenon. Journal of International Economics 52, 299-319
Hand, D., Mannila, H., Smyth, P., 2001, Principles of Data Mining. MIT Press,
Cambridge, Massachusetts
Meese, R., Rogoff, K., 1983, Empirical exchange rate models of the seventies: do
they fit out of sample? Journal of International Economics 14, 3-24
Rogoff, K., 2002, Dornbusch's overshooting model after twenty-five years. IMF
Working Paper WP/02/39
White, H., 2000, A reality check for data snooping. Econometrica 68, 1097-1126





The Ability of Artificial Neural Networks to Exploit 
Non-Linearities by Data Mining Models Compared 
to Statistical Methods 



Lutz Beinsen and Bernd Brandl

Department of Economics, University of Graz, Universitätsstraße 15, A-8010
Graz, Austria, lutz.beinsen@uni-graz.at / Department of Government, University
of Vienna, Brünnerstraße 72, A-1210 Vienna, Austria, bernd.brandl@univie.ac.at



Abstract. This paper discusses the question whether Artificial Neural Networks
(ANNs) have the capacity to exploit (additional) non-linear information on the
basis of exchange rate forecast models selected on linear criteria. This includes
a comparison of exchange rate forecasts between ANNs and linear statistical
methods, and it is asked whether linear relationships serve as an approximation.
The forecast models under consideration are selected by a data mining approach
which combines fundamentals from economic theory, or building blocks from economic
theory, with “fundamentals” derived solely by statistical criteria. This
combination of theoretical and statistical relationships in data mining makes sure
that both long-run determinants of exchange rate behaviour and current influences
are integrated in the forecast models. The results are evaluated for five exchange
rates at a monthly frequency. The results favour the use of ANNs as they slightly
improve the out-of-sample performance of the forecast models. What is more, linear
as well as non-linear methods can be applied and the advantages of both can be
used: statistical methods allow a more detailed analysis of the results, whereas
ANNs offer a slightly better forecast performance.



Introduction 

For almost twenty years now, research on ANNs in the context of exchange rates
can be found in economic literature. At the same time, forecasting exchange rates
has become a challenge among economists, and the use of new methods, such as
ANNs, has emerged as one of the most exciting areas of the discipline. During the
last years one could also witness an interesting reorientation in research and
thinking about the nature of exchange rate behaviour. Throughout the last two
decades, numerous papers have been produced on comparisons between traditional
linear methods and non-linear methods such as ANNs. See for example Rehkugler and
Poddig (1990), Franses and van Griensven (1998) and Zhang et al. (2001). Such
research has shed new light on the dynamics and notions of the relation between
fundamentals and exchange rates. Nevertheless, non-linear analysis is quite often
accompanied by a variety of problems and disadvantages, especially in the process
of model selection, which traditional (linear) methods and procedures do not show.
“Traditional methods” and “new methods” both reveal advantages as well as
disadvantages, so the question of which method is strictly to be preferred remains
open in most cases.



The forecast models 



The question of the future behaviour of exchange rates is interesting not only
for market practitioners but also for theorists. Because of the substantial
macroeconomic effects of exchange rate movements, it is not surprising that
several theories, accompanied by a variety of statistical methods, have been
applied. It has to be stressed, though, that since the work of Meese and Rogoff
(1983) economic literature in general sees most exchange rate forecasting attempts
as relatively unsuccessful compared to the effectiveness of the naive forecast.
This is still the mainstream notion in the literature, even though recent
literature rather frequently claims to beat the random walk. Within this article a
discussion of the variety of existing approaches to forecasting exchange rates is
omitted. As mentioned, within this paper fundamental forecast models are used
which are derived by a combination of data mining with economic theory. To be more
precise, two popular approaches to explaining exchange rate movements, in their
expression as forecast equations, namely the Purchasing Power Parity (PPP) and the
so-called monetary approach, are used as a core around which a Genetic Algorithm
(GA) is applied to extend these theories by additional explanatory variables.



PPP and the monetary approach 

The monetary approach states that since the exchange rate is the relative price
of two currencies, it has to reflect the willingness to hold these monies. The
determinants therefore should be the supply of and demand for these monies.
Economic theory tells us that demand for money depends on the main economic
variables such as income, prices and interest rates. Hence these variables should
indicate changes in nominal exchange rates and therefore constitute the forecast
equation. The theory of PPP states that exchange rates between currencies are in
equilibrium when their purchasing power, or the amount of goods and services that
one unit of money can buy, is the same in each of the two countries. This means
that the exchange rate between two countries should equal the ratio of the two
countries’ price levels of a fixed basket of goods and services. When a country’s
domestic price level is increasing (i.e. a country is experiencing inflation),
that country’s exchange rate must depreciate in order to return to PPP. In the
absence of transportation and other transaction costs, competitive markets will
equalise prices of identical goods and services in two countries when the prices
are expressed in the same currency. So, according to the relatively simple PPP
approach, the parsimonious forecast equation consists only of the two countries’
price levels. In the real world, however, there are transportation and transaction
costs which may disturb or weaken such relationships and may provoke
non-linearity. See for example Iannizzotto (2001) for evidence regarding
non-linearity in PPP adjustments. In this sense it appears appealing to search
among such relationships for non-linearity, which may possibly improve results.
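The PPP relation described above can be written compactly. The notation is an assumption made for illustration and does not appear in the paper: $S_t$ is the price of one unit of foreign currency in domestic currency, $P_t$ and $P_t^*$ are the domestic and foreign price levels of the common basket, and lower-case letters denote logarithms.

```latex
\begin{align}
S_t &= \frac{P_t}{P_t^*} && \text{(absolute PPP)} \\
s_t &= p_t - p_t^* && \text{(in logarithms)} \\
\Delta s_t &= \Delta p_t - \Delta p_t^* && \text{(relative PPP, in changes)}
\end{align}
```

Under this quotation a rise in the domestic price level raises $S_t$, which means that the domestic currency depreciates, in line with the verbal statement. The last line also shows why the parsimonious forecast equation contains only the two countries' price levels.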



The data mining extension 

The fact that there is correlation, dependency and interaction between variables
other than those cited in economic theory and integrated in the forecast equations
is obvious, and newer literature has shown that such variables can have an
important, and statistically significant, bearing on the determination of exchange
rates in out-of-sample tests. However, the role of such alternative variables
provokes heavy discussion in the exchange rate forecasting literature.
Nevertheless, as the sources of influence are manifold and the relationships on
the foreign exchange market are often abstruse, complex and temporary, data mining
can be very powerful in detecting current relationships in data. To build the
forecast models in this work, a GA is applied which has the capacity to find such
(unknown) relationships and patterns between independent variables and dependent
variables, but which is constrained to search for variables consistent with the
forecast equations from economic theory. In addition to the GA, typical data
mining methods such as cluster analysis and (rolling) correlation analysis have
been applied in order to reduce the GA search space to a computable size. A
principal idea in the process of finding exchange rate forecast models was to
combine economic theory with data mining. The reason for this combination is the
attempt to find and consider different sources of influence on future exchange
rate behaviour. This means that the variables in the forecast equations are
selected by theoretical criteria and/or statistical criteria to form a broad basis
of influences. As experience showed, the combined approach performs better than
either approach applied separately.



Input selection on basis of linear criteria 

In economic literature linear models have usually been used to model
relationships between economic variables. This practice can be justified by the
following facts: linear models represent in many cases a reasonable approximation
of non-linear models, whose functional form is in general not known a priori, and,
due to the lack of computing techniques, it is very costly to specify and estimate
complex non-linear models. Nevertheless, in order for exchange rate forecast
models to beat the naive forecast, a linear description of the dependencies
between variables may not be sufficient. As already mentioned, the GA search space
was reduced to a computable size by the use of cluster analysis and (rolling)
correlation analysis. Nevertheless, as ANNs demand an exponentially higher degree
of computational capacity than traditional statistical methods, the GA fitness
function was based on linear regression. Apart from computational reasons for the
use of linear instead of non-linear criteria, another advantage of regression
analysis in the process of model selection is that it can use statistical tests to
show the importance of each variable. In addition, linear regression analysis is a
very flexible method with well-known properties. Moreover, it is easy to
understand and its results are interpretable. These are principal advantages of
regression analysis, but they come with the disadvantage that non-linear
relationships are forgone. The question is whether there is non-linearity in the
data or not and, if so, how these relationships can be exploited without a
disproportionate increase in computational requirements. Therefore the applied
approach of selecting forecast models on linear criteria and then applying ANNs
promises to be more effective. ANNs in particular, which do not demand strong
assumptions about the nature of the data as traditional statistical techniques do,
appear to be adequate as they may use the linear relationship as an approximation
to exploit additional non-linearity in the data. Moreover, as recent literature
has stressed, there is empirical as well as theoretical evidence for the existence
of such non-linearity. See for example Ma and Kanas (2000) for a recent
investigation of non-linearity in the context of exchange rates. Besides the
existence of transportation costs and transaction costs, as recently stressed in
the literature, the use of non-linear methods can be especially fruitful in the
presence of phenomena such as rational bubbles, stop-loss trading strategies of
traders, interventions by central banks and noise traders, as all four are sources
of non-linearity in data.^1 So, a comparison of the out-of-sample forecast
performance of ANNs with that of statistical methods should show whether
non-linearity can be found and, if so, whether ANNs can improve forecasts.
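The advantage mentioned above, that regression analysis can show the importance of each variable through statistical tests, can be illustrated with ordinary least squares and t-statistics. This is a minimal sketch on synthetic data; the variable names and the data-generating process are invented, and it is not the fitness function used in the paper.

```python
import numpy as np

def ols_with_t_stats(X, y):
    """OLS coefficients and t-statistics; X must already contain a constant column."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    residuals = y - X @ beta
    sigma2 = residuals @ residuals / (n - p)     # unbiased error variance
    std_err = np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta, beta / std_err

# Synthetic example: one relevant and one irrelevant explanatory series.
rng = np.random.default_rng(2)
n = 200
relevant = rng.normal(size=n)
irrelevant = rng.normal(size=n)
y = 0.8 * relevant + 0.3 * rng.normal(size=n)
X = np.column_stack([np.ones(n), relevant, irrelevant])

beta, t_stats = ols_with_t_stats(X, y)
```

A large absolute t-statistic flags the relevant series, while the irrelevant one stays close to zero; an ANN offers no equally direct diagnostic, which is one reason for selecting the inputs on linear criteria first.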



Are ANNs able to find additional non-linearity? 

ANNs are highly sophisticated computer applications that borrow features from
the physiology of the human brain, which enables them to recognise patterns in
large amounts of data. They are made up of simple processing elements called
neurons, named after the most basic cells in the human brain. By adding
connections between these artificial neurons an ANN is created. ANNs have some
advantages that suggest they have the capacity to find additional non-linearity.
First, using an ANN does not require making strong assumptions about the nature of
the underlying time series as traditional statistical techniques do. In this sense
they are more flexible. Second, and maybe more important in this context, an ANN
does not require the user to choose a priori which variables are important, which
variables provide non-linear information and what the functional form has to be.
The network itself should indicate which variables contain non-linear information
and how this non-linearity can be incorporated in the whole set of explanatory
variables in a model. So, ANNs seem to be ideal for improving models if the linear
relationships found are only an approximation. Therefore a comparison between a
linear regression analysis and an ANN analysis should show whether it is possible
to exploit additional non-linear information behind existing linear relationships.
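The comparison described above can be sketched by fitting a linear regression and a small network to the same inputs. The example below is an illustration only: the network architecture (one hidden layer of tanh neurons), the training by full-batch gradient descent, the hyperparameters and the synthetic data with a quadratic component are all assumptions and do not reproduce the networks used in the paper.

```python
import numpy as np

def train_mlp(X, y, hidden=8, lr=0.01, epochs=3000, seed=0):
    """One-hidden-layer tanh network fitted by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    W1 = rng.normal(scale=0.5, size=(p, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                 # hidden-layer activations
        err = H @ W2 + b2 - y
        losses.append(float(np.mean(err ** 2)))
        # Gradients of the mean squared error with respect to each parameter.
        g_pred = 2.0 * err / n
        g_W2 = H.T @ g_pred
        g_b2 = g_pred.sum()
        g_H = np.outer(g_pred, W2) * (1.0 - H ** 2)
        g_W1 = X.T @ g_H
        g_b1 = g_H.sum(axis=0)
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2

    def predict(X_new):
        return np.tanh(X_new @ W1 + b1) @ W2 + b2

    return predict, losses

# Synthetic data: a linear part plus a quadratic part the linear model misses.
rng = np.random.default_rng(3)
x = rng.uniform(-2.0, 2.0, size=(200, 1))
y = 0.5 * x[:, 0] + 0.5 * x[:, 0] ** 2 + 0.1 * rng.normal(size=200)

# Linear benchmark: OLS with a constant.
X_lin = np.column_stack([np.ones(len(x)), x])
beta, *_ = np.linalg.lstsq(X_lin, y, rcond=None)
linear_mse = float(np.mean((X_lin @ beta - y) ** 2))

predict, losses = train_mlp(x, y)
mlp_mse = float(np.mean((predict(x) - y) ** 2))
```

Comparing `mlp_mse` with `linear_mse` on data held out from estimation, rather than in-sample as in this sketch, would correspond to the out-of-sample comparison made in the paper.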



^1 For recent literature on empirical as well as theoretical evidence regarding the
existence of non-linearity in connection with exchange rate modeling see also Boero
and Marrocu (2002), Guerra (2001), Mahajan and Wagner (1999), Qi and Wu (2001),
Sarantis (1999), and Taylor and Peel (2000).








Empirical results 

The empirical results of the out-of-sample one-step forecasts have been evaluated
for five exchange rates at a monthly frequency.^2 The statistical measures are
reported in Tables 1 and 2.^3



Table 1. Econometric forecast results

                   ME        MSE       CORR(F,R)   R²        HR
Euro/US dollar     -0.0026   0.0009    -0.0934               53%
Pound/US dollar    -0.0035   0.0004    -0.1619
Yen/US dollar      -0.0030   0.0015    -0.5938               50%
Euro/Pound         -0.0036   0.0005    -0.2193               53%
Euro/Yen           -0.0069   0.0015    -0.5341               53%



Table 2. ANN forecast results

                   ME        MSE       CORR(F,R)   R²        HR
Euro/US dollar     -0.0026   0.0009    -0.0521     0.0510    73%
Pound/US dollar    -0.0023   0.0003    -0.1759     0.1986    63%
Yen/US dollar       0.0008   0.0009    -0.0663     0.2088    60%
Euro/Pound         -0.0036   0.0004    -0.0457     0.1965    70%
Euro/Yen           -0.0005   0.0015    -0.0477     0.0630    60%



To assess the difference between the non-linear and the linear approach, a variety
of statistical measures are compared. Although the predicted values track each other
fairly closely, the hit rate of the direction of change is substantially
higher with the ANN, while the other measures are not rarely better in the linear
approach, especially for the Euro/Yen exchange rate. However, the empirical
out-of-sample one-step-ahead forecast results of this work compare the ability of
ANNs to exploit non-linearity in exchange rate forecast models on a monthly
frequency that were selected on linear criteria. The direct comparison between ANNs and
statistical methods reveals that ANNs are able to outperform the statistical
procedures in terms of commonly applied performance measures. We conclude that
ANNs offer an alternative and often superior predictive capacity over traditional
statistical methods, but we stress the effectiveness and importance of statistical
methods for selecting, interpreting and analysing. As the improved performance of
the non-linear ANN approach shows, linear and non-linear forms of
analysis can be used together, as they complement each other. Statistical models and
methods provide good approximations on which non-linear methods such as ANNs can
build up even better performances. 
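
The evaluation measures reported in Tables 1 and 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sign convention of the residual (actual minus forecast), the reading of the hit rate as agreement in the sign of the change relative to the last observed value, and the sample numbers are all assumptions made for illustration.

```python
from math import sqrt

def forecast_measures(actual, forecast, previous):
    """Return ME, MSE, CORR(F,R) and the hit rate for one-step-ahead forecasts.

    actual[t]   : realised value in period t
    forecast[t] : forecast made for period t
    previous[t] : last observed value before period t (defines the direction of change)
    """
    n = len(actual)
    residuals = [a - f for a, f in zip(actual, forecast)]  # assumed sign convention
    me = sum(residuals) / n
    mse = sum(r * r for r in residuals) / n

    # Pearson correlation between forecasts and residuals.
    mean_f = sum(forecast) / n
    cov = sum((f - mean_f) * (r - me) for f, r in zip(forecast, residuals))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_r = sum((r - me) ** 2 for r in residuals)
    corr = cov / sqrt(var_f * var_r)

    # Hit rate: share of periods in which predicted and realised change have the same sign.
    hits = sum(1 for a, f, p in zip(actual, forecast, previous) if (a - p) * (f - p) > 0)
    return me, mse, corr, hits / n

# Hypothetical exchange rate values, for illustration only.
actual = [1.10, 1.08, 1.12, 1.15]
forecast = [1.09, 1.10, 1.11, 1.13]
previous = [1.07, 1.10, 1.08, 1.12]
me, mse, corr, hr = forecast_measures(actual, forecast, previous)
```

In this toy series the forecast for the second period predicts no change at all, so it is counted as a miss under the strict sign rule assumed here.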



^ All data were provided by Reuters Terminal; the monthly series used cover
1991-02-28 to 2001-09-28. The out-of-sample evaluation periods were the
last 30 periods back from 2001-09-28.

^ ME (Mean Error), MSE (Mean Square Error), CORR(F,R) (Correlation between Forecast
and Residuals), R² (R-square), HR (Hit rate)












References 

Boero, G., Marrocu, E., 2002, The performance of non-linear exchange rate models: a
forecasting comparison, Journal of Forecasting, forthcoming
Franses, P.H., van Griensven, K., 1998, Forecasting Exchange Rates Using Neural Networks
for Technical Trading Rules, Studies in Nonlinear Dynamics and Econometrics,
2:4, 109-114
Guerra, R., 2001, Fundamentals and Exchange Rates: How About Nonlinear Adjustment?,
Working Papers 01.04, Department of Economics, University of Geneva
Iannizzotto, M., 2001, Exchange rate misalignment and nonlinear convergence to purchasing
power parity in the European exchange rate mechanism, Applied Financial Economics
11:5, 511-526
Mahajan, A., Wagner, A., 1999, Nonlinear dynamics in foreign exchange rates, Global
Finance Journal 10:1, 1-23
Ma, Y., Kanas, A., 2000, Testing for a nonlinear relationship among fundamentals and
exchange rates in the ERM, Journal of International Money and Finance 19, 135-152
Meese, E., Rogoff, K., 1983, Empirical exchange rate models of the seventies: do they fit
out of sample?, Journal of International Economics 14, 3-24
Qi, M., Wu, Y., 2001, Nonlinear Prediction of Exchange Rates with Monetary Fundamentals,
Working Paper
Rehkugler, H., Poddig, T., 1990, Statistische Methoden versus Künstliche Neuronale Netzwerke
zur Aktienkursprognose: Eine vergleichende Studie, Bamberger Betriebswissenschaftliche
Beiträge 73, Universität Bamberg 1990
Sarantis, N., 1999, Modelling non-linearities in real effective exchange rates, Journal of
International Money and Finance 18, 27-45
Taylor, M.P., Peel, D.A., 2000, Nonlinear adjustment, long-run equilibrium and exchange
rate fundamentals, Journal of International Money and Finance 19, 33-53
Zhang, G.P., Patuwo, B.E., Hu, M.Y., 2001, A simulation study of artificial neural networks
for nonlinear time-series forecasting, Computers & Operations Research 28, 381-396





Preference Measurement with Conjoint Analysis
and AHP: An Empirical Comparison



Roland Helm, Laura Manthey, Armin Scholl, Michael Steiner 
Friedrich-Schiller-University Jena, 

{roland.helm, l.manthey, a.scholl, m.steiner}@wiwi.uni-jena.de 



Summary: Conjoint analysis (CA) and the analytic hierarchy process (AHP) are common
methods for measuring preferences, with CA dominating marketing research and practice
and AHP becoming more and more relevant as a tool of decision analysis.

The two methods differ mainly in their basic conception: AHP is a compositional
method whereas CA is designed in a decompositional manner. Our study aims at comparing
the methods as instruments of preference measurement on a fair basis; the two procedures
were therefore designed to be as similar as possible. As the decision problem we use the
question of which university to prefer or how to design preferable universities,
respectively. The results show a high degree of predictive and convergent validity of
both methods. However, inspecting the results in detail reveals considerable differences
which indicate that AHP performs slightly better.



1 Introduction 

Measuring the subjective preference of individuals and groups of individuals is an 
important task in several scientific disciplines like psychology, behavioural re- 
search, and economics. From the economic point of view preference measurement 
is a central issue both in decision analysis and marketing. 



Table 1. Preference measurement in decision analysis and marketing

                 decision analysis                   marketing
problem          selection of an alternative         design of products/services
objective        maximal subjective utility          maximal profit
core problem     modelling and measuring preferences (both fields)
selection of     scoring methods                     self explicated methods
methods          multiattribute utility theory       multidimensional scaling
                 analytic hierarchy process (AHP)    variants of conjoint analysis (CA)



Table 1 contrasts both fields concerning the role of preference measurement. 
Within decision analysis the problem consists of selecting a utility maximizing al- 
ternative among a set of feasible ones, whereas marketing aims at designing prod- 
ucts and services which promise a maximal profit within a market or market seg- 
ment. In both cases the most critical task is to model and measure the preferences 
of decision makers and/or customers (DM for short). 

Each research area has developed its own set of methods for measuring
preferences based on different attributes or criteria that are important for judging
alternatives. Despite the obvious similarity mentioned above, only a few researchers
have recognized and discussed this correspondence (Wind and Saaty 1980;
Tscheulin 1991; Mulye 1998). Table 1 shows some typical methods in both areas.

Considering the similarity of the problems to be solved, their importance and 
the almost complete separation of the research in both fields it seems to be very 
interesting to examine methods from both areas in a common problem context. 
Therefore, we conducted an empirical study comparing AHP and CA in order to 
find out whether both methods reach similar (converging) results or which method 
gives the better ones. 

As decision problem we consider the task of choosing the individual best uni- 
versity (from the decision analytic point of view) and designing the most attractive 
university (from the marketing point of view), respectively. We call this problem 
the university selection problem (USP). 

In Sect. 2, descriptions of AHP and CA are given. Sect. 3 presents the design of 
the study and a selection of results. Concluding remarks are given in Sect. 4. 



2 Selection and Description of the Methods 

For our analysis, we choose AHP and CA, because both are popular and success- 
ful methods in decision analysis and marketing, respectively. AHP has wide appli- 
cations in different areas of decision making (Zahedi 1986; Vargas 1990). CA is 
the most used research method for measuring customer preferences in marketing 
with many real-world applications (Wittink et al. 1994; Green et al. 2001). 

In order to perform a fair comparison and to find out fundamental methodical 
differences between AHP and CA we choose a basic version of each method, 
which has proven to be successful, such that their similarity is as large as possible. 

2.1 Basic Version of AHP 

From the marketing point of view, AHP is a compositional method transforming 
and combining elementary preference judgments to total utilities of alternatives, 
which may solve the problem. In the following, we give an outline of the essential 
steps of AHP (building a hierarchy of attributes, computing relative utilities of at- 
tribute levels and relative weights of attributes, and deriving total utilities). For 
more detailed descriptions see Saaty (1980) and Helm et al. (2002). 

At first, the attributes being relevant for judging alternatives have to be deter- 
mined and organized within a hierarchy. At the top level, the main objective of the 
decision problem has to be specified and decomposed into several second level at- 
tributes. Each attribute may also be subdivided generating a tree of attributes with 
an arbitrary number of levels. The subdivision stops as soon as elementary attrib- 
utes are found which may directly be used for evaluating alternatives. 

At the lowest level of a complete hierarchy all alternatives are arranged each 
being connected to every elementary attribute. In an incomplete hierarchy only the 
different levels of the attributes are integrated in the hierarchy and connected to 
the respective attribute. Fig. 1 shows an example of an incomplete hierarchy with
three levels, thereby concretizing our design of the USP used within the study. The
attributes $h=1,2,3,4,6$ have $M_h=3$ levels and attribute 5 has $M_5=2$ levels.








Fig. 1. Hierarchy of Attributes for University Selection Problem 



Table 2. Saaty's scale

verbal judgment               $v_{jk}$   $v_{kj}$
j and k are equal             1          1
j is weakly better            3          1/3
j is strongly better          5          1/5
j is very strongly better     7          1/7
j is absolutely better        9          1/9



In order to compute utilities and weights, the hierarchy is evaluated in a bottom up 
manner by comparing all pairs of elements being related to the same element at 
the next level. 

At first, the decision maker (DM) compares every attribute level $j=1,\dots,M_h$
with each other for all elementary attributes $h=1,\dots,H$. For measuring the
relative preference the DM has for an attribute level $j$ versus a level $k$, the
9-point ratio scale of Saaty (1980, ch. 1-4) is used. It transforms verbal judgments
by the DM into priority ratios $v_{jk}$ (Table 2). Since $v_{jk}$ is defined as a ratio,
the reciprocal value expresses that $k$ is preferred over $j$, i.e. $v_{kj} = 1/v_{jk}$
for all pairs $(j,k)$.

By means of these ratios we can compute relative utility values $u_{hj}$ (with sum 1)
of the attribute levels $j$ with respect to an attribute $h$ by the relation
$u_{hj}/u_{hk} = v_{jk}$ for all pairs $(j,k)$ of levels of $h$ (note that for each $h$
another matrix $V=(v_{jk})$ is built).

However, this computation requires the consistency condition $v_{jk} = v_{ji} \cdot v_{ik}$
to be fulfilled for all triplets of levels $(j,i,k)$. To measure the usual inconsistency of
human judgments, Saaty (1980, ch. 3) proposed the computation of the largest
eigenvalue $\lambda_{\max}$ of the comparison matrix $V$. The degree of inconsistency can be
measured by the consistency ratio CR which relates the difference $\lambda_{\max} - M_h$ to the
average difference of random matrices with dimension $M_h$. If CR < 0.1, the matrix
is considered to be sufficiently consistent, otherwise the DM should rework it.

A vector of reasonable relative utilities is obtained from normalizing the eigen- 
vector connected to the largest eigenvalue (Saaty 1980, ch. 7). 
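
The eigenvector step and the consistency ratio can be sketched in a few lines of Python. This is a hedged illustration rather than the procedure used in the study: the power iteration, the function name `ahp_priorities`, and the tabulated random indices (the values commonly attributed to Saaty) are assumptions, and CR is computed in the usual form CI/RI with CI = (λ_max − M)/(M − 1).

```python
# Commonly tabulated random indices for matrix dimensions 1..9 (assumed here).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_priorities(v, iterations=100):
    """Return (normalised principal eigenvector, largest eigenvalue, CR) of matrix v."""
    n = len(v)
    u = [1.0 / n] * n
    lam = float(n)
    for _ in range(iterations):
        w = [sum(v[j][k] * u[k] for k in range(n)) for j in range(n)]
        lam = sum(w)              # sum(u) == 1, so sum(V u) estimates the largest eigenvalue
        u = [x / lam for x in w]  # renormalise so that the utilities sum to 1
    ri = RANDOM_INDEX[n]
    ci = (lam - n) / (n - 1) if n > 1 else 0.0
    cr = ci / ri if ri > 0 else 0.0
    return u, lam, cr

# Perfectly consistent matrix built from utilities (0.6, 0.3, 0.1).
consistent = [[1, 2, 6], [1 / 2, 1, 3], [1 / 6, 1 / 3, 1]]
u1, lam1, cr1 = ahp_priorities(consistent)

# Slightly inconsistent judgments on Saaty's scale: 3 * 3 != 5.
inconsistent = [[1, 3, 5], [1 / 3, 1, 3], [1 / 5, 1 / 3, 1]]
u2, lam2, cr2 = ahp_priorities(inconsistent)
```

For the consistent matrix the utilities (0.6, 0.3, 0.1) are recovered and CR is zero; the second matrix stays below the 0.1 threshold although its judgments violate the consistency condition.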

In an analogous manner, relative weights of the attributes h can be computed 
level-by-level starting with elementary attributes. For this purpose, the scale of 
Table 2 has to be modified such that it verbally expresses importance relations 
transformed into importance ratios.

Finally, the utilities and weights computed while traversing the hierarchy can
be used for computing total utilities of alternatives. Within our study, we collected
preference data based on an incomplete hierarchy. Therefore we are interested in
partial utilities (part-worths) of all attribute levels in order to be able to evaluate
real universities (alternatives) with pre-specified levels of the different elementary
attributes. The part-worth of level $j$ of attribute $h=1,\dots,H$ is given by
$U_{hj} = w_h \cdot u_{hj}$, where $w_h$ denotes the relative weight of attribute $h$.
In its basic form used here, AHP utilizes a linear additive utility
function. This means that the total utility of an alternative is computed by sum- 
ming up the part-worths of the active levels j of the elementary attributes h. 
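
As a small illustration of the additive model, the following Python sketch sums the part-worths of the active levels. The attribute names, weights and utilities are hypothetical and are not taken from the study.

```python
def total_utility(alternative, weights, utilities):
    """Sum the part-worths w_h * u_hj over the active level j of every attribute h."""
    return sum(weights[h] * utilities[h][j] for h, j in alternative.items())

# Hypothetical attribute weights (sum 1) and relative level utilities (sum 1 per attribute).
weights = {"reputation": 0.5, "location": 0.3, "fees": 0.2}
utilities = {
    "reputation": [0.6, 0.3, 0.1],
    "location": [0.5, 0.3, 0.2],
    "fees": [0.7, 0.3],
}

# An alternative is described by the index of its active level for each attribute.
university = {"reputation": 0, "location": 1, "fees": 1}
score = total_utility(university, weights, utilities)
```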

2.2 Basic Version of CA 

CA is a decompositional method that asks for general judgments on alternatives
(stimuli, products), which are decomposed into part-worths for single attribute levels.
It is a general concept being applicable in many variations (Green and Srinivasan
1990, Mulye 1998, Hensel-Börner 2000).

We only describe a basic version of CA which is well suited to performing a 
fair comparison to AHP (for details see Helm et al. 2002). To rebuild the basic op- 
eration within AHP, we choose a full profile approach with paired comparisons of 
given stimuli as evaluation tasks within CA (Hausruckinger and Herker 1992). 

In the first step of CA, the attributes and their levels have to be defined. This 
step is crucial for the success of a study. Therefore, we performed a preliminary 
study based on several methods to determine the six most important attributes 
(Helm et al. 2002). The selected attributes and their different levels are also used 
for AHP (cf. Fig. 1). 

The total number of stimuli which may be designed by combining the different
levels of all attributes is $3^5 \cdot 2^1 = 486$. This is far too many to be evaluated by any
DM, because there exist $(486 \cdot 485)/2 = 117{,}855$ pairs of different alternatives. In
order to obtain a manageable set of stimuli and pairs, respectively, we apply stan- 
dard reduction techniques based on principles defined by Addelman (1962). In this 
study a 3^ Addelman basis plan is applicable which contains 18 stimuli. 
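
The size of the full factorial design can be checked by enumeration. The short Python sketch below only counts stimuli and pairs; it does not construct the reduced Addelman plan, and the variable names are illustrative.

```python
from itertools import product
from math import comb

# Five attributes with three levels and one attribute (attribute 5) with two levels.
levels_per_attribute = [3, 3, 3, 3, 2, 3]

# Every stimulus is one combination of attribute levels.
stimuli = list(product(*(range(m) for m in levels_per_attribute)))
n_stimuli = len(stimuli)

# Number of unordered pairs of different stimuli.
n_pairs = comb(n_stimuli, 2)
```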

The pairs of stimuli to be compared are defined by a difference design with 24
comparisons. The design has been constructed using a randomised procedure based on the
construction principles given in Hausruckinger and Herker (1992). Within this design, the
DMs have to evaluate utility differences between stimuli i and k on a bipolar rating
scale. In order to get the best possible similarity with AHP, a nine-point scale is used
(Table 3).

In the last step, the part-worths of all attribute levels are estimated. As indicated
by the scales of the attributes we employ an ordinary least squares (OLS)
regression which is based on an additive linear utility function.

Using the part-worths, the best possible stimulus (e.g. an ideal product) may be 
derived for any individual by setting each attribute to its level which has the larg- 
est part- worth. 
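
The estimation step can be sketched on a toy problem. The following Python code is an illustration under stated assumptions, not the design of the study: it uses two hypothetical attributes (A with three levels, B with two), dummy coding with the first level as reference, all 15 pairs of the 6 profiles instead of a reduced difference design, and noise-free rating differences generated from assumed part-worths.

```python
from itertools import combinations, product

LEVELS = {"A": 3, "B": 2}  # hypothetical attributes and their numbers of levels

def dummy(profile):
    """Dummy-code a profile; level 0 of each attribute is the reference level."""
    row = []
    for (name, n_levels), level in zip(LEVELS.items(), profile):
        row.extend(1.0 if level == lev else 0.0 for lev in range(1, n_levels))
    return row

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(x, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)

# Assumed part-worths relative to the reference levels, in the order A2, A3, B2.
true_beta = [1.0, 2.0, -1.0]
profiles = list(product(range(3), range(2)))
x, y = [], []
for p, q in combinations(profiles, 2):
    diff = [a - b for a, b in zip(dummy(p), dummy(q))]
    x.append(diff)                                          # regressor: profile difference
    y.append(sum(c * d for c, d in zip(true_beta, diff)))   # rating of the utility difference

beta = ols(x, y)

# Best possible stimulus: for each attribute, the level with the largest part-worth.
part_worths = {"A": [0.0, beta[0], beta[1]], "B": [0.0, beta[2]]}
ideal = {name: max(range(len(pw)), key=pw.__getitem__) for name, pw in part_worths.items()}
```

Because the ratings are generated without noise, the regression recovers the assumed part-worths exactly up to rounding; with real answers on the bipolar scale the fit would be imperfect.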



Table 3. Bipolar 9-point scale

verbal judgment               value (i vs. k)   value (k vs. i)
i and k are equal              0                 0
i is weakly better             1                -1
i is strongly better           2                -2
i is very strongly better      3                -3
i is absolutely better         4                -4







3 Design and Results of the Empirical Study 

We consider the decision problem of selecting a university for studying economics
(USP). The study was organized as a paper-and-pencil interview with more than
300 students of economics from the University of Jena (Germany) participating.
After eliminating incomplete or incorrect questionnaires, a sample of 232 DMs 
remained which served as basis for our analysis. In order to examine and equalize 
sequence effects, the methods were presented in different orders, i.e., one half of 
the DMs started with AHP, the other one with CA. 

To evaluate the results of AHP and CA on an objective basis, a reference 
method (RM) is needed which gives realistic preference information. For this pur- 
pose, we used a direct rating method: While AHP and CA were based on fictitious 
universities only being characterized by certain attribute levels, RM asked DMs to 
express preferences on more realistic alternatives by distributing a total of 100 
points among six real German universities. Because students differ in their
knowledge of the real properties of the universities considered,
they were asked to assign attribute levels to each university in order to reveal their
basis of judgment. Due to the increase in realism and the higher level of information,
it can be expected that RM measures the real preferences with sufficient reliability
despite the methodological weaknesses of direct rating approaches.

The main goal of preference elicitation methods is to get a valid and reliable 
model of the DM's preference structure. In the following, we consider different 
measures for judging and comparing the methods with respect to several types of 
validity (Green and Srinivasan 1990) and discuss the results. 

Predictive Validity: Comparing AHP and CA to RM may give insight into their ability
to model the real preferences of a DM and to predict his/her behaviour
(predictive validity). Taking the ranking of universities given by RM as a surrogate of
the reality, the predictive validity may be judged on the basis of several hit rates.

Table 4. Hit rates

          CA      AHP
HR1       54.3    57.3
HR12      31.0    35.3
HR123     23.7    28.0

HR1 measures the frequency with which AHP or CA generates the same first-ranked
university as RM. Both methods show high hit rates, indicating a good predictive
performance, with AHP being slightly better. This impression is intensified when
considering the requirement of reproducing the first two ranks (HR12) or first three
ranks (HR123) of RM. This
finding is supported by further measures including Spearman’s coefficient of rank
correlation (Helm et al. 2002). 
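
The hit rates HR1, HR12 and HR123 can be computed as in the following Python sketch. The rankings are hypothetical and only illustrate the counting rule; the function name `hit_rate` is an assumption.

```python
def hit_rate(method_rankings, reference_rankings, top):
    """Share of DMs (in %) whose first `top` ranks agree between a method and the reference."""
    hits = sum(1 for m, r in zip(method_rankings, reference_rankings) if m[:top] == r[:top])
    return 100.0 * hits / len(reference_rankings)

# Hypothetical rankings of six universities (best first) for four DMs.
rm = [list("ABCDEF"), list("BACDEF"), list("CABDEF"), list("ACBDEF")]
ahp = [list("ABCEDF"), list("BCADEF"), list("ACBDEF"), list("ACBFDE")]

hr1 = hit_rate(ahp, rm, 1)    # same first-ranked university
hr12 = hit_rate(ahp, rm, 2)   # same first two ranks
hr123 = hit_rate(ahp, rm, 3)  # same first three ranks
```

Since reproducing more ranks is a stricter requirement, the hit rates can only stay equal or fall from HR1 to HR123, which matches the pattern in Table 4.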

A quite different way of evaluating a method's predictive validity is to examine
its ability to fulfil obvious preference statements of the DM. A very evident
assumption is that a rational DM always prefers an alternative that dominates all
other ones. For performing such a dominance test, we construct sets of realistic
alternatives, one dominating all others, based on elementary and reliable statements
asked for within AHP and CA, respectively. Via the part-worths provided by CA and AHP
the total utilities of all alternatives within the test can be
computed. Whenever the dominant alternative gets a lower value than any dominated
alternative, the method examined has failed the test.








• AHP-based dominance test: AHP directly asks a DM for comparing the 2 or 3
levels of every attribute with each other. From those statements it is possible to
deduce a reliable ranking of the levels for each attribute separately. Now we
construct a realistic alternative X which shows the second-ranked level of each
attribute except for $h=5$, where it gets the first-ranked level (due to $M_5=2$). Six
dominated alternatives are derived from X by setting attribute $h=1,\dots,6$ to the
worst level, respectively. Whenever a method prefers at least one of them to X, it is
not able to reproduce the most elementary preference statements of the DM and
fails the test. PA denotes the frequency of failing these tests.

• CA-based dominance test: Within CA, 24 paired comparisons of stimuli (i,k)
are performed where DMs are directly asked for their relative preference. Due 
to this explicit statement, any pair constitutes a reliable dominance test with i 
dominating k, k dominating i or both judged as being equal. For both tested 
methods, we check whether or not the pre-defined relationship is reproduced by 
the computed total utilities. Let PC be the frequency of failing these tests. 
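
The AHP-based dominance test described above can be sketched as follows. This Python code is a minimal illustration for a single DM; the data structures, the function name and the example part-worths are assumptions.

```python
def ahp_dominance_test(rankings, part_worths):
    """Return True if the tested method passes the AHP-based dominance test for one DM.

    rankings[h]       : levels of attribute h ordered from best to worst (DM's statements)
    part_worths[h][j] : part-worth the tested method assigns to level j of attribute h
    """
    # Alternative X: second-ranked level, or the first-ranked one for two-level attributes.
    x = {h: (r[1] if len(r) > 2 else r[0]) for h, r in rankings.items()}
    u_x = sum(part_worths[h][j] for h, j in x.items())
    for h, r in rankings.items():
        dominated = dict(x)
        dominated[h] = r[-1]  # set attribute h to its worst level
        u_dom = sum(part_worths[g][j] for g, j in dominated.items())
        if u_dom > u_x:       # a dominated alternative is preferred to X
            return False
    return True

# Hypothetical DM: attribute 1 has three levels, attribute 5 has two.
rankings = {1: [0, 1, 2], 5: [1, 0]}
monotone = {1: [0.5, 0.3, 0.1], 5: [0.1, 0.4]}       # respects the DM's rankings
contradictory = {1: [0.5, 0.1, 0.3], 5: [0.1, 0.4]}  # worst level rated above the second
passes = ahp_dominance_test(rankings, monotone)
fails = not ahp_dominance_test(rankings, contradictory)
```

PA would then be the share of DMs for whom the function returns False.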

Table 5 shows that CA prefers a dominated alternative for more than 73% of all DMs.
By construction, AHP passes all such tests, because the partial utilities are
monotonically decreasing within the ranking constructed. The bad result of CA is due to
the decompositional approach where complex evaluation tasks are given to the DMs. The
majority of them do not succeed in judging trade-offs such that the elementary
relationships on attribute levels are correctly reflected. Considering PC we recognize
that CA performs better than AHP but is not able to fulfil all preference orderings that
serve as input for the estimation of part-worths. Possible reasons are inconsistent
statements of DMs, the complexity within the decompositional approach and the linear
utility model used. Therefore, PC can be interpreted as a measure for internal validity
in case of CA.

Table 5. Dominance tests

          CA      AHP
PA [%]    73.3    0.0
PC [%]    17.5    28.5

Internal validity: Generally speaking, the internal validity aims at evaluating the 
internal information processing within a preference elicitation method. One aspect 
is the consistency of preference statements achieved. As described in Sect. 2, a 
consistency ratio CR is computed for each comparison matrix of AHP with CR > 
0.1 indicating inconsistent judgments. In an interactive application of AHP such a 
matrix is given back to the DM for modification until it fulfils the consistency 
condition. In our paper-and-pencil study, this was not possible. Therefore, we had 
to accept all preference statements during the interview and use CR only as a 
means for an ex post evaluation of the consistency achieved. In order to evaluate 
the degree of consistency of the entire hierarchy of a DM, we use the arithmetic 
mean of the consistency ratios CR of all comparison matrices. An alternative way 
not applied here is proposed by Saaty (1980, ch. 4.5). 

Both measures and their thresholds are difficult to interpret. In order to give an
alternative, which is directly derived from the consistency conditions of AHP, we
define a geometric consistency index GI. For each subset of three attributes or
attribute levels $\{j,i,k\}$ the consistency condition $v_{jk} = v_{ji} \cdot v_{ik}$ is
considered, defining a ratio

$$g_{jik} = \frac{\min\{v_{jk},\, v_{ji} \cdot v_{ik}\}}{\max\{v_{jk},\, v_{ji} \cdot v_{ik}\}} \le 1$$

with perfect consistency indicated by $g_{jik}=1$.

The geometric average of all $g_{jik}$ within any comparison matrix $V$ leads to an
index GI(V). To judge the overall consistency of a DM's preference statements, the
geometric average over all GI(V) is built to compute the index GI. In order to
exclude only DMs who did not cope with the evaluation task or were not willing to
give useful answers, we set GI > 0.35 as threshold for acceptable consistency.

In case of CA, the standard measure for internal validity is the coefficient of
determination R², which is computed within the regression anyway. A value close to 1
shows that it was possible to find part-worths that match the input relationships 
well. As a threshold for acceptable consistency we set a value of 0.8. 
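
The geometric consistency index GI of AHP introduced above can be illustrated with a short Python sketch. Two points are assumptions of this sketch and are not stated in the text: each subset of three levels is evaluated once in the order j < i < k, and a 2x2 matrix, which contains no triple, is treated as perfectly consistent.

```python
from itertools import combinations
from math import exp, log

def geometric_mean(values):
    return exp(sum(log(x) for x in values) / len(values))

def gi_matrix(v):
    """Geometric mean of the ratios g_jik over all triples j < i < k of one matrix v."""
    ratios = []
    for j, i, k in combinations(range(len(v)), 3):
        direct, indirect = v[j][k], v[j][i] * v[i][k]
        ratios.append(min(direct, indirect) / max(direct, indirect))
    # Assumption: matrices without any triple (dimension 2) count as fully consistent.
    return geometric_mean(ratios) if ratios else 1.0

def gi_overall(matrices):
    """Geometric mean of GI(V) over all comparison matrices of one DM."""
    return geometric_mean([gi_matrix(v) for v in matrices])

consistent = [[1, 2, 6], [1 / 2, 1, 3], [1 / 6, 1 / 3, 1]]
inconsistent = [[1, 3, 5], [1 / 3, 1, 3], [1 / 5, 1 / 3, 1]]  # 3 * 3 = 9 versus 5
two_levels = [[1, 3], [1 / 3, 1]]

gi_single = gi_matrix(inconsistent)
gi_dm = gi_overall([consistent, inconsistent, two_levels])
acceptable = gi_dm > 0.35
```

For the inconsistent matrix the single ratio is 5/9, so GI(V) is about 0.56; averaged geometrically with the two consistent matrices, the DM remains above the 0.35 threshold.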



Table 6 shows averages of the measures on the entire sample and the quotas of DMs
considered as sufficiently consistent. While R² and GI indicate that most of the DMs
were able to cope with the evaluation tasks sufficiently, CR suggests eliminating the
majority of DMs from the study.

However, within a marketing research context AHP should be able to get reasonable
results even for DMs that do not fulfil the hard requirements set by CR. Indeed, further
investigations reveal that both methods are robust in the sense that there is no
fundamental difference between results obtained for a small subsample of very consistent
DMs and the entire sample. For respective results see Helm et al. (2002).

Table 6. Consistency indices

method   index   average              share of consistent DMs
CA       R²      0.83 (arithmetic)    74.1% (R² > 0.8)
AHP      CR      0.13 (arithmetic)    40.1% (CR < 0.1)
AHP      GI      0.41 (geometric)     68.5% (GI > 0.35)



Convergent validity: We directly compare the results obtained by CA and AHP 
when evaluating the set of six real universities: For 68.9% of all DMs the first- 
ranked universities are identical. Completely the same ranking of alternatives is 
obtained in 15.9%. For the remaining DMs a high Spearman’s coefficient of rank 
correlation of 0.77 is obtained on average. These results strongly indicate that 
there is a high degree of convergent validity between both methods. 
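
Spearman's coefficient of rank correlation for two rankings without ties follows the standard formula 1 − 6 Σd² / (n(n² − 1)). The Python sketch below applies it to two hypothetical rankings of six universities; the names U1 to U6 are placeholders.

```python
def spearman(rank_a, rank_b):
    """Spearman's rank correlation for two tie-free rankings given as item -> rank."""
    n = len(rank_a)
    d2 = sum((rank_a[item] - rank_b[item]) ** 2 for item in rank_a)
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

# Hypothetical rankings of six universities by CA and AHP (1 = best).
ca = {"U1": 1, "U2": 2, "U3": 3, "U4": 4, "U5": 5, "U6": 6}
ahp = {"U1": 1, "U2": 3, "U3": 2, "U4": 4, "U5": 5, "U6": 6}

rho = spearman(ca, ahp)
rho_identical = spearman(ca, ca)
```

Swapping two neighbouring ranks, as in this example, still yields a coefficient of about 0.94.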

Since both methods use completely different approaches, we can support the 
previous observation that AHP and CA provide valid models of the DM’s prefer- 
ences and thus have good predictive capabilities. 

Applicability of the methods: Additionally, it is important to examine factors 
that determine the applicability of the methods, because valid measurements are 
only possible if DMs are able and willing to apply the method in a serious manner. 
Hence, we consider factors in the four categories motivation, information content, 
cognitive burden and clearness. We find that AHP is at least slightly preferred for 
all indicators except for “degree of realism”. In particular, the important
motivation-related indicators show a clear advantage for AHP (cf. Helm et al. 2002).

A further indicator strongly favouring AHP is the time necessary for giving the
preference statements. While AHP required 5.9 minutes on average, the DMs had
to spend 9.2 minutes on CA. That means that AHP is able to gather more information
in a given time span. However, it has to be recognized that the information
content provided by CA per paired comparison is greater. Since the previous findings
on the validity support the assumption that AHP is at least on a par with CA,
the latter seems not to derive any advantage from this greater information content.



4 Conclusion 

From our study, we derive that AHP is a good alternative to CA for measuring 
preferences in case of a relatively complex problem with more than the usual 
number of attributes and levels within standard CA studies. With respect to almost 
every measure AHP performs slightly better. In particular, AHP seems to have a
better applicability with respect to motivation and time.

Notwithstanding the similarity of total utilities computed by CA and AHP for 
complete alternatives, significant differences occur when considering partial utili- 
ties of single attributes. CA produces considerable contradictions with the conse- 
quence that dominated alternatives are preferred to dominating ones. 



References 

Addelman S (1962) Orthogonal main-effect plans for asymmetrical factorial experiments. 
Technometrics 4:21-46 

Green PE, Srinivasan V (1990) Conjoint analysis in marketing: New developments with 
implications for research and practice. Journal of Marketing 54 (October):3-19 

Green PE, Krieger AM, Wind YL (2001) Thirty years of conjoint analysis: Reflections and 
prospects. Interfaces 31/3:S56-S73 

Hausruckinger G, Herker A (1992) Die Konstruktion von Schätzdesigns für Conjoint-
Analysen auf der Basis von Paarvergleichen. Marketing ZfP 14/2:99-110

Helm R, Manthey L, Scholl A, Steiner M (2002) Empirical evaluation of preference elicita- 
tion techniques from marketing and decision analysis. Working Paper, FSU Jena 

Hensel-Börner S (2000) Validität computergestützter hybrider Conjoint-Analysen. Gabler,
Wiesbaden

Mulye R (1998) An empirical comparison of three variants of the AHP and two variants of
conjoint analysis. Journal of Behavioral Decision Making 11:263-280

Saaty TL (1980) The analytical hierarchy process. McGraw-Hill, New York 

Tscheulin K (1991) Ein empirischer Vergleich der Eignung von Conjoint-Analyse und
"Analytic Hierarchy Process" (AHP) zur Neuproduktplanung. Zeitschrift für
Betriebswirtschaft 61:1267-1280

Vargas LG (1990) An overview of the analytic hierarchy process and its applications. 
European Journal of Operational Research 48:2-8 

Wind Y, Saaty TL (1980) Marketing applications of the analytic hierarchy process. Man- 
agement Science 26:641-658 

Wittink DR, Vriens M, Burhenne W (1994) Commercial use of conjoint analysis in Europe: 
Results and critical reflections. International Journal of Research in Marketing 11:41- 
52 

Zahedi F (1986) The analytic hierarchy process - A survey of the method and its applica- 
tion. Interfaces 16:96-108 





Further Development of MADM-Approaches in 
China and in Germany 



Jutta Geldermann, Kejing Zhang, Otto Rentz 
University of Karlsruhe, 

{jutta.geldermann, kejing.zhang, otto.rentz}@wiwi.uni-karlsruhe.de 



Abstract: This paper gives an overview of, and compares, the main streams of thought in
Multi Attribute Decision Making (MADM) theory and practice in China and Germany.
MADM approaches are suitable for the evaluation of a set of discrete alternatives. Widely
applied approaches, including Multi Attribute Utility Theory (MAUT), Analytic Hierarchy
Process (AHP) and the outranking methods ELECTRE, PROMETHEE, TOPSIS, find
applications in strategic production planning, site selection or technique assessment under
economic, ecological and technical attributes. In China, MAUT and AHP are predominantly
applied, while in Germany outranking methods are more popular. As a further
development, the integration of fuzzy theory and artificial intelligence in MADM is
discussed, so that imprecise information can be considered.



1 Introduction 

MADM methods are suited to the problem of evaluating a set of well-defined, dis- 
crete alternatives. With the developments of MADM, two different philosophies 
have been distinguished in the last few decades [9]: 

• The American school assumes that the DMs have precise ideas about the utility
of performance and weights. These approaches are generally called “classical
MADM approaches”. MAUT and AHP belong to this group. These approaches
are appropriate for incorporating a hierarchical structure of attributes.

• The European philosophy supposes that the DMs are not aware of their preferences,
so that they need decision support to structure the decision situation. The
significant contribution of outranking methods is the introduction of incomparability
and weak preference. ELECTRE and PROMETHEE are two important
outranking methods.

The starting point of MADM is a decision matrix, which is composed of alternatives
and decision-relevant attributes, by which the alternatives are compared.

Along with the decision procedure, such fundamental issues of MADM as 

1. Problem structuring: definition of alternatives and structuring of attributes 

2. Determination of weights 

3. Consideration of imprecision 

4. Preference aggregation, recommendation 

5. MADM in the context of group decisions 

6. Sensitivity analysis 

are discussed in the following sections, relating to the research work. 







2 Commonly Applied MADM Methods and Applications 
in China 

MAUT and AHP are two popular MADM techniques in China. Especially, AHP is 
most frequently analysed and applied. So far many suggestions for improvement 
have been discussed. 

In the literature, problem structuring received little attention. A lot of efforts 
have been made to derive preferences in the normalization. Liu and Qiu [15] stud- 
ied the theories on attributes. Six types of attributes, such as profit, cost, fixation, 
interval, deviation, and deviating interval attribute, with the generalised preference 
functions, have been presented.* 



2.1 The Treatment of Imprecision in MADM 

The imprecision of preference information is one fundamental problem of 
MADM. The imprecision can be represented in terms of interval numbers or fuzzy 
numbers or verbal inputs, which seem more natural and may better represent the 
reality. Consequently, Fuzzy theory is integrated into MADM. 

Fan and co-workers [8, 33] conducted much research on MADM problems with 
interval numbers. A linear programming method and a ranking approach with pos- 
sibilities have been proposed to solve MADM problems with uncertain intervals. 

Much research has addressed AHP, especially its consistency ratio and fuzzy 
extensions. An Extent Analysis Method for handling fuzzy AHP has been 
introduced, which applies the principle of comparing triangular fuzzy numbers 
[34]. Leung and Cao [14] introduced Sinarchy to address earlier criticisms of 
AHP. They show that in AHP the tradeoffs between attributes vary among 
individual alternatives and depend on each alternative's proportion of 
contribution to each attribute. Sinarchy should be used for problems in which 
the tradeoffs between attributes are expressed in terms of their relative 
measurements. Sinarchy can also be used to prevent rank reversal. 

In Saaty's opinion, a consistency ratio (C.R.) of no more than 0.1 is acceptable. 
Leung and Cao [13] defined fuzzy consistency based on a tolerance deviation.^ Xu 
and Wei [31] proposed a method by which the C.R. of the modified matrix is lower 
than that of the original one. 
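
The consistency ratio mentioned above can be illustrated with a minimal sketch. It assumes the usual definition C.R. = C.I. / R.I. with C.I. = (lambda_max - n) / (n - 1) and Saaty's tabulated random indices; the function names and the power-iteration approach are choices made here for illustration and are not taken from the cited papers.

```python
# Saaty's random consistency indices (R.I.) for matrix orders 1..9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def principal_eigen(matrix, iterations=200):
    """Approximate lambda_max and the priority vector by power iteration."""
    n = len(matrix)
    w = [1.0 / n] * n
    lam = float(n)
    for _ in range(iterations):
        mw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        lam = sum(mw)              # w sums to 1, so sum(M w) estimates lambda_max
        w = [value / lam for value in mw]
    return lam, w


def consistency_ratio(matrix):
    """Return C.R. of a positive reciprocal pairwise comparison matrix."""
    n = len(matrix)
    if n < 3:
        return 0.0                 # matrices of order 1 or 2 are always consistent
    lam, _ = principal_eigen(matrix)
    consistency_index = (lam - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical judgements of one decision maker on three attributes.
judgements = [[1.0, 3.0, 5.0],
              [1 / 3, 1.0, 2.0],
              [1 / 5, 1 / 2, 1.0]]
acceptable = consistency_ratio(judgements) <= 0.1
```

A matrix whose C.R. exceeds 0.1 would, under Saaty's rule, be returned to the decision maker or adjusted by a method such as the one in [31].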

Besides the above-mentioned approaches, fuzzy integrated evaluation methods 
are widely applied in China. The notion of the ideal point, introduced by 
TOPSIS, has been used in Set Pair Analysis to deal with stability analysis 
under uncertainty. 
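
To make the ideal-point notion concrete, the following is a small sketch of the standard TOPSIS closeness coefficient. It is not the Set Pair Analysis variant referred to above; the vector normalisation, the function name and the sample data are assumptions for illustration, and the sketch presumes that the alternatives are not all identical.

```python
import math


def topsis(matrix, weights, benefit):
    """Return the closeness of each alternative to the ideal point (0..1).

    matrix  : rows are alternatives, columns are attributes (positive values)
    weights : attribute weights
    benefit : True for attributes to maximise, False for attributes to minimise
    """
    m, n = len(matrix), len(weights)
    # Vector normalisation of each column, then weighting.
    norms = [math.sqrt(sum(matrix[i][j] ** 2 for i in range(m))) for j in range(n)]
    v = [[weights[j] * matrix[i][j] / norms[j] for j in range(n)] for i in range(m)]
    columns = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)   # distance to the ideal point
        d_neg = math.dist(row, anti)    # distance to the anti-ideal point
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical data: attribute 0 is a benefit, attribute 1 a cost.
closeness = topsis([[9, 1], [5, 5], [1, 9]], [0.5, 0.5], [True, False])
```

The alternative with the largest closeness value is the one nearest to the ideal point and farthest from the anti-ideal point.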



* Furthermore, for each type of attribute, a new method of qualitative-quantitative analysis 
in indices is proposed, based on different characters of the indices [24]. 

^ Fuzzy ratios of relative importance, allowing certain tolerance deviation, are formulated 
as constraints on the membership values of the local priorities. The fuzzy local and global 
weights are determined via the extension principle. The alternatives are ranked on the ba- 
sis of the global weights by applying a max-min set ranking method. 








2.2 Incorporation of MADM in Group Decision Making 

MADM has been extended into group decision making situations. A great deal of 
research work has been done in this area. 

Lai et al. [12] applied the AHP technique to support the selection of a 
multimedia authoring system in a group decision. The post-study survey suggested 
that AHP is preferable to the Delphi technique and more beneficial to consensus 
building in group decision settings. Xu [30] developed a theoretical basis for 
applying the Weighted Geometric Mean Method in group decision making by proving 
that the weighted geometric mean complex judgement matrix is of acceptable 
consistency. Wei et al. [25] proposed a procedure of preference adjustments, 
based on the min-max principle, to find the compromise weight in decision groups. 
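
As an illustration of the Weighted Geometric Mean Method, the sketch below aggregates the judgement matrices of several decision makers element by element. The function name, the decision makers' weights and the sample matrices are hypothetical; the sketch only shows the aggregation step, not the consistency proof of [30].

```python
import math


def weighted_geometric_mean(matrices, weights):
    """Aggregate pairwise comparison matrices element-wise.

    Entry (i, j) of the result is the product over all decision makers k of
    a_k[i][j] ** (w_k / sum(w)), so the weights are normalised internally.
    """
    n = len(matrices[0])
    total = sum(weights)
    return [[math.prod(m[i][j] ** (w / total) for m, w in zip(matrices, weights))
             for j in range(n)]
            for i in range(n)]


# Two hypothetical decision makers comparing three alternatives.
dm1 = [[1.0, 2.0, 4.0], [0.5, 1.0, 2.0], [0.25, 0.5, 1.0]]
dm2 = [[1.0, 3.0, 5.0], [1 / 3, 1.0, 2.0], [0.2, 0.5, 1.0]]
group = weighted_geometric_mean([dm1, dm2], [0.6, 0.4])
```

One reason the geometric mean is commonly used here is that the aggregated matrix stays reciprocal: entry (j, i) remains the inverse of entry (i, j).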



2.3 Weighting Methods 

Attribute weights play an important role in measuring the overall preference 
values of the alternatives in MADM models. Many distinct procedures for deriving 
weights have been proposed, such as trade-off, swing, direct-ratio and Saaty's 
eigenvector approach. In China, a widely accepted weighting method derives an 
objective weight set through linear programming [8]. Ma et al. [17] proposed an 
integrated subjective and objective weighting method by formulating a 
two-objective optimisation model. 

Wei et al. [26] analysed the structure of the weight set with a parameter P 
while keeping the preference orders on the alternatives, where P is the 
differential amount of value that makes one alternative preferable to another. 
They prove that the weight set can be written in the standard form of linear 
programming problems and is determined by a convex combination of the extreme 
points according to the corresponding intervals of P. 



2.4 New MADM Models 

One of the newly proposed methods is Superiority and Inferiority Ranking (SIR) 
[28], which is proved to be an extension of PROMETHEE. SIR offers more choices 
at the aggregation step. When Simple Additive Weighting (SAW) is used as the 
aggregation procedure, the model SIR·SAW coincides with PROMETHEE, while 
TOPSIS can be used to build a new model, SIR·TOPSIS. The relationships between 
these methods have also been explored. SIR thus appears to be not a single 
method but a general MADM approach. 
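
For readers unfamiliar with Simple Additive Weighting, the following sketch shows plain SAW applied directly to a decision matrix. It is not the SIR procedure of [28]; the linear max/min normalisation, the function name and the data are assumptions made for this example, and all attribute values are assumed to be positive.

```python
def saw_scores(matrix, weights, benefit):
    """Simple Additive Weighting with linear scale normalisation.

    Benefit attributes are normalised as x / max, cost attributes as min / x,
    and the normalised values are combined in a weighted sum.
    """
    columns = list(zip(*matrix))
    scores = []
    for row in matrix:
        total = 0.0
        for j, x in enumerate(row):
            if benefit[j]:
                normalised = x / max(columns[j])
            else:
                normalised = min(columns[j]) / x
            total += weights[j] * normalised
        scores.append(total)
    return scores


# Hypothetical decision matrix: attribute 0 is a benefit, attribute 1 a cost.
decision_matrix = [[4, 200], [2, 100], [3, 150]]
scores = saw_scores(decision_matrix, [0.7, 0.3], [True, False])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```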

Xu et al. [29] proposed a MADM procedure based on distances between partial 
preorders. Firstly, the DM ranks alternatives with a preorder for each attribute and 
provides complete/partial linear information about weights. Secondly, a distance 
procedure is used to aggregate the above individual rankings into a global ranking. 

It is widely accepted that weights have to be determined in conventional MADM. 
However, Du and Yu [4] proposed an approach in which weight information is 
formulated as facts and rules without being converted into weights. A 
preference ranking can then be obtained by reasoning over this knowledge. 








The following table gives a brief overview of the applications of MADM in China. 

Table 1. Applications of MADM in China 

Type of problem                             Method                                  Reference 
Blank selection in DFM                      Integrated Fuzzy approach               [21] 
Evaluation of CIMS investment strategies    Fuzzy approach                          [27] 
Software selection                          Group AHP                               [12] 
Oil-field evaluation                        Fuzzy AHP and Extent Analysis Method    [34] 
Evaluation of institutes                    Linear Programming                      [8] 
Evaluation of environment, resources and    Building of hierarchical attributes     [32] 
sustainable development 



3 Commonly Applied MADM Methods and Applications 
in Germany 

Although classical methods, such as MAUT and AHP, are also applied, the 
outranking methods (ELECTRE, PROMETHEE) are much more popular in 
Germany than in China. MADM researchers in Germany have laid stress on: 

1. Application to real world problems, especially in environmental assessment. 

2. Comparative studies of different MADM methods. 



3.1 Theoretical Research 

In MADM, DMs are often assumed to face a situation in which well-defined 
alternatives already exist. However, when the set of available courses of 
action is not known, DMs need to define appropriate alternatives. Dyckhoff and 
Ahn [5] proposed the principle of successive revelation of decision-relevant 
alternatives, which is characterised by the integration of the generation and 
evaluation phases. 

Preference relations are defined as strict preference, weak preference, 
indifference and incomparability. Two preference relations may contradict each 
other. Esser [7] developed a framework to systematise the connections between 
preference relations by defining and analysing when one preference relation is 
more complete, consistent and compatible with respect to another. This 
conceptual framework can be applied to compare preference relations. 

The wide variety of available techniques confuses the users. Distinct MADM 
techniques usually yield different results when applied to the same problem. A 
certain MADM method may appear to be appropriate for a particular case. 

Pudenz et al. [20] applied MADM methods to evaluate sustainability of man- 
agement strategies. MAUT, PROMETHEE, and AHP are presented and compared 
with other mathematical methods.^ 



^ Whereas the mathematical approach (Hasse diagram technique) is directed to the scien- 
tifically given data matrix and therefore yields an objective and transparent evaluation, 
MADM approaches have a higher participation degree by DMs in the decision process. 









Eickemeier [6] studied different weighting methods in MADM and compared 
the nine-point scale of Saaty's AHP, Barzilai/Lootsma's multiplicative AHP, and 
the exchange ratio in the form of a membership value by Kacprzyk/Nurmi. 



3.2 MADM in Group Decision Making 

Chwolka and Raith [3] extended different group aggregation procedures applied in 
AHP to multiple-issue decision problems^ and developed a utilitarian weighted 
arithmetic mean method for preference aggregation. 

Geldermann et al. [9] studied incorporation of MADM into group decision, in 
which common procedures are proposed to aggregate group preference for 
MAUT, AHP and PROMETHEE. 



3.3 Fuzzy MADM 

German researchers have done much work in the field of fuzzy MADM. 
Rommelfanger and Eickemeier [23] extended fuzzy decision theory and integrated 
fuzzy theory into MADM. Buckley et al. [2] proposed a method to find the fuzzy 
weights in fuzzy AHP. Geldermann and co-workers [11, 10] developed Fuzzy 
PROMETHEE, which deals with imprecise information formulated as trapezoidal 
fuzzy numbers. This approach is applied to the environmental assessment of 
production techniques and the determination of Best Available Techniques. 



3.4 New Research Area in MADM 

MADM has been used in environmental decision situations, which pose a new 
challenge for MADM in ethical respects. Rauschmayer [22] argues that decision 
analysis has to reflect on its normative foundation. As a prerequisite for a 
normative argument of the DMs, the attributes have to reflect not only the 
interests but possibly also all values stemming from normative arguments of the 
DMs, which is especially important for environmental decisions. The integration 
of values will result in changes to the decision process and will not be 
possible without analytical capabilities of the decision analyst in ethics. 



^ It is demonstrated how existing procedures will generally fail to generate Pareto optimal 
agreements when applied to multiple users. The approach provides a theoretical basis for 
designing the AHP to implement social choice functions in practice. 








The following table shows some selected applications of MADM in Germany. 

Table 2. Applications of MADM in Germany 

Type of problem                                      Method                   Reference 
Software evaluation                                  AHP                      [19] 
Synergy allocation among the partners in a merger    AHP^                     [18] 
Environmental assessment of production techniques    Fuzzy PROMETHEE          [10], [11] 
in iron and steel industry. Determination of Best 
Available Techniques 
Evaluation of sustainability of water management     MAUT, AHP, PROMETHEE     [20] 
strategies 
Two-issue marketing negotiation                      Group AHP                [3] 



4 Sensitivity Analysis 

Sensitivity analysis plays an essential role in MADM because it gives the DMs 
insight into the decision. Much research has been dedicated to this subject. 

Wei et al. [26] proposed a method to determine the weight set, while keeping 
the preference orders on alternatives. The results are important for sensitivity 
analysis. Liu and Qiu [16] presented a sensitivity analysis method for TOPSIS and 
Double Base Points Ordering Method. 

Geldermann and co-workers [11, 9] developed a graphical sensitivity analysis 
instrument, delivering insensitivity intervals, and a systematic sensitivity analysis 
method, by which a sensitivity degree of each attribute can be derived. 
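
The idea of an insensitivity interval can be sketched as follows. This is a simple grid-based approximation for a weighted-sum model, not the graphical instrument of [11, 9]; it assumes an already normalised decision matrix, weights that sum to 1 with the examined weight below 1, and proportional rescaling of the remaining weights. All names and data are hypothetical.

```python
def weighted_sum(matrix, weights):
    """Overall value of each alternative as a weighted sum of its attributes."""
    return [sum(w * x for w, x in zip(weights, row)) for row in matrix]


def insensitivity_interval(matrix, weights, k, steps=1000):
    """Approximate the range of weight k in which the best alternative is stable.

    The weight of attribute k is varied on a grid over [0, 1]; the other
    weights are rescaled proportionally so that all weights still sum to 1.
    """
    def best(wk):
        rest = 1.0 - weights[k]
        scaled = [wk if j == k else w * (1.0 - wk) / rest
                  for j, w in enumerate(weights)]
        scores = weighted_sum(matrix, scaled)
        return scores.index(max(scores))

    reference = best(weights[k])
    grid = [i / steps for i in range(steps + 1)]
    start = min(range(steps + 1), key=lambda i: abs(grid[i] - weights[k]))
    low = high = start
    # Expand the interval in both directions while the winner is unchanged.
    while low > 0 and best(grid[low - 1]) == reference:
        low -= 1
    while high < steps and best(grid[high + 1]) == reference:
        high += 1
    return grid[low], grid[high]


# Hypothetical normalised decision matrix with two alternatives.
normalised = [[1.0, 0.2], [0.4, 1.0]]
interval = insensitivity_interval(normalised, [0.5, 0.5], k=0)
```

A narrow interval around the current weight signals that the recommendation depends strongly on that attribute's weight.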



5 Conclusions 

The literature on theory and applications of MADM in both countries has been re- 
viewed. This study reflects the recent contributions made by MADM researchers 
in both countries. Fundamental issues such as MADM models, weighting meth- 
ods, attribute generation, preference elicitation and integration of fuzzy theory are 
presented. Each country has its own emphases on the MADM research. 

Taking into account the recent developments in problem structuring, future 
interdisciplinary research should be dedicated to cultural attitudes towards the 
interpersonal process of decision making, especially in the light of 
globalisation [1]. 



^ Ossadnik and Lange [19] applied AHP to evaluate the AHP software products 
AutoMan, ECPro and HIPRE 3+. The results delivered by the software were used to 
generate pairwise comparisons in the AHP model. The relevant attributes are 
derived from ISO/IEC 9126. 









6 References 

[1] Belton V, Stewart TJ (2001) Multiple criteria decision analysis: An integrated 
    approach. Kluwer Academic Publishers, London 

[2] Buckley JJ, Feuring T, Hayashi Y (2001) Fuzzy hierarchical analysis revisited. 
    European Journal of Operational Research 129:48-64 

[3] Chwolka A, Raith MG (2001) Group preference aggregation with the AHP - 
    Implications for multiple-issue agendas. European Journal of Operational 
    Research 132:176-186 

[4] Du XM, Yu YL (1999) Intelligent multi-attribute decision making. Journal of 
    Armament 20:90-93 

[5] Dyckhoff H, Ahn H (1998) Integrierte Alternativen-Generierung und -Bewertung. 
    Die Betriebswirtschaft 58:49-63 

[6] Eickemeier S (2001) Bestimmung der Gewichte bei Mehrzielentscheidungen. Eine 
    vergleichende Analyse ausgewählter Verfahren. In: Chamoni P, Leisten R, 
    Martin A, Minnemann J, Stadtler H (Hrsg) Operations Research Proceedings, 
    Springer-Verlag, Berlin, 389-396 

[7] Esser J (2001) Vollständigkeit, Konsistenz und Kompatibilität von 
    Präferenzrelationen. OR Spectrum 23:183-201 

[8] Fan ZP, Zhang Q (1998) A linear programming method for uncertain multiple 
    attribute decision making. Journal of North-Eastern University 19:419-421 

[9] Geldermann J, Zhang KJ, Rentz O (2002) Entwicklung eines integrierten 
    multikriteriellen Gruppenentscheidungsunterstützungssystems (MGDSS). In: 
    Fichtner W, Geldermann J (Hrsg) Einsatz von OR-Verfahren zur 
    Techno-ökonomischen Analyse von Produktionssystemen, Verlag Peter Lang, 
    Frankfurt, 169-186 

[10] Geldermann J, Spengler T, Rentz O (2000) Fuzzy outranking for environmental 
    assessment case study: Iron and steel making industry. Fuzzy Sets and Systems 
    115:45-65 

[11] Geldermann J, Rentz O (2001) Integrierte Technikbewertung bei unvollständigen 
    Informationen als Unterstützung für die Bestimmung von Besten Verfügbaren 
    Techniken (BVT). OR Spectrum 23:137-157 

[12] Lai VS, Wong BK, Cheung W (2002) Group decision making in a multiple attribute 
    environment: A case using the AHP in software selection. European Journal of 
    Operational Research 137:134-144 

[13] Leung LC, Cao D (2000) On consistency and ranking of alternatives in fuzzy AHP. 
    European Journal of Operational Research 142:102-133 

[14] Leung LC, Cao D (2001) On the efficacy of modelling multi-attribute decision 
    problems using AHP and Sinarchy. European Journal of Operational Research 
    132:39-49 

[15] Liu SL, Qiu YH (1998) Studies on the basic theories for MCDM. Theory and 
    Practice of System Engineering 1:38-43 

[16] Liu SL, Qiu YH (1998) Generalization for the double base points ordering method 
    for MADM. Theory and Practice of System Engineering 2:23-25 

[17] Ma J, Fan ZP, Huang LH (1998) A subjective and objective integrated approach to 
    determine attribute weights. European Journal of Operational Research 
    112:397-404 

[18] Ossadnik W (1996) AHP-based synergy allocation to the partners in a merger. 
    European Journal of Operational Research 88:42-49 








[19] Ossadnik W, Lange O (1999) AHP-based evaluation of AHP-software. European 
    Journal of Operational Research 118:578-588 

[20] Pudenz S, Brüggemann R, Voigt K, Welzl G (2002) Nachhaltige Entwicklung von 
    Managementstrategien - Multikriterielle Bewertungs- und 
    Entscheidungshilfe-Instrumente. Umweltwissenschaften und Schadstoff-Forschung 
    14(1):52-57 

[21] Qiao LH, Ma T, Wang SC (1999) An approach to multiattribute evaluation in 
    manufacturability analysis for blank selection. Chinese Journal of Mechanical 
    Engineering 35(4):42-46 

[22] Rauschmayer F (2001) Reflections on ethics and MCA in environmental decisions. 
    Journal of Multi-Criteria Decision Analysis 10(2):65-74 

[23] Rommelfanger HJ, Eickemeier SH (2002) Entscheidungstheorie - Klassische 
    Konzepte und Fuzzy Erweiterungen. Springer-Verlag, Berlin 

[24] Wang P, Wang LL, Xian H, Luo BY (1999) Study of qualitative quantitative 
    normalization of targets in multi-objective evaluation. Journal of Wuhan 
    Automotive Polytechnic University 21(6):37-40 

[25] Wei QL, Han H, Ma J, Fan ZP (2000) A compromise weight for multi-attribute group 
    decision making with individual preference. Journal of the Operational Research 
    Society 51:625-634 

[26] Wei QL, Ma J, Fan ZP (2000) A parameter analysis method for the weight-set to 
    satisfy preference orders of alternatives in additive multi-attribute value 
    models. Journal of Multi-Criteria Decision Analysis 9(5):181-190 

[27] Wu YH, Fu YJ, Zhou JR (1998) Research on Fuzzy multiobjective method for the 
    evaluation of CIMS. Theory and Practice of System Engineering 9:55-60 

[28] Xu XZ (2001) The SIR method: A superiority and inferiority ranking method for 
    multiple attribute decision making. European Journal of Operational Research 
    131:587-602 

[29] Xu XZ, Martel JM, Lamond BF (2001) A multiple criteria ranking procedure based 
    on distance between partial preorders. European Journal of Operational Research 
    133:69-80 

[30] Xu Z (2000) On consistency of the weighted geometric mean complex judgement 
    matrix in AHP. European Journal of Operational Research 126:683-687 

[31] Xu ZS, Wei CP (1999) A consistency improving method in the analytic hierarchy 
    process. European Journal of Operational Research 116:443-449 

[32] Yue CY, Li H (1998) Multi-attribute evaluation for environment, resource and 
    sustainable development. Monitoring and Assessment 12:28-30 

[33] Zhang F, Fan ZP, Pan DH (1999) A ranking approach with possibilities for multi 
    attribute decision making problems with intervals. Control and Decision 
    14(6):703-706 

[34] Zhu KJ, Jing Y, Chang DY (1999) A discussion on Extent Analysis Method and 
    applications of fuzzy AHP. European Journal of Operational Research 116:450-456 





Knowledge Revision in a MaxEnt/ 
MinREnt Environment 



Elmar Reucher and Wilhelm Rödder 
FernUniversität in Hagen, 

Lehrstuhl für Betriebswirtschaftslehre, insb. Operations Research 
elmar.reucher@fernuni-hagen.de, wilhelm.roedder@fernuni-hagen.de 



Abstract The representation of human knowledge and its processing by machines 
are an important research topic in the field of Artificial Intelligence (AI). 
Knowledge generally changes over time, so that once new rules have been learned, 
previously known ones may no longer be valid and the "old" knowledge has to be 
revised. In this paper, rules in a domain are formulated as probabilistic 
conditionals, which provide a language for communication between human and 
machine that is close to the human way of thinking. Knowledge formulated in this 
way is processed in an information-faithful manner by the entropy principle. The 
paper shows how the knowledge revision process is carried out in an 
entropy-optimal environment. An economic example illustrates the process in more 
detail; all computations required for it are carried out in the expert system 
shell SPIRIT. 

1 Introduction 

The basic idea of knowledge acquisition in this paper is to build a distribution 
of maximum entropy (MaxEnt distribution) on the range of a finite set of 
finite-valued variables while respecting given probabilistic conditionals. 
Knowledge acquisition is not a one-off affair, however; as ever newer knowledge 
becomes available, it changes dynamically over time. Sometimes newer knowledge 
is compatible with older knowledge, sometimes it is not. The literature offers 
various forms of adaptation for this, summarised under the term knowledge 
revision [3]. This paper focuses on modelling one special form of knowledge 
revision: newly acquired rules are to be valid under the current state of 
knowledge, while the old rules are revised "as little as possible". 

2 Foundations 

2.1 Probabilistic Conditionals 

Let a distribution $P$ be defined on the event field over a set of variables 
$V = \{V_1, \ldots, V_n\}$, where each variable $V_i$ has a finite number of 
discrete values $v_i$. Let $p$ denote the probability function with 
$P(V = v) = p(v)$ for the complete conjunction $v = v_1 \cdots v_n$. If the 
variable realisations are regarded as literals, a language $\mathcal{L}$ of 
syntactic expressions can be formulated by connecting the literals by 
disjunction $\vee$, negation $\neg$ and conjunction $\wedge$. Such propositional 
formulas are called facts, and the elements of $\mathcal{L}$ are denoted by 
capital letters in the following. 

Probabilistic rules are expressions of the form $B|A\,[x]$, where the 
conditional $B|A$ is well defined according to the three-valued logic of 
Calabrese. The expression "if $A$ then $B$", $B|A(v)$, is true if the fact $BA$ 
is true under $v$, false if $\bar{B}A$ is true under $v$, and undefined if 
$\bar{A}$ is true under $v$ [1]. $x$ is the probability with which the 
conditional holds in a domain. 
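
The three-valued reading of a conditional B|A described in this subsection can be sketched in a few lines. Representing worlds as dictionaries and facts as predicates is an assumption made for this illustration; the names are hypothetical and are not part of SPIRIT.

```python
def evaluate_conditional(antecedent, consequent, world):
    """Three-valued truth of the conditional B|A in a given world.

    Returns True if A and B both hold, False if A holds but B does not,
    and None (undefined) if A does not hold.
    """
    if not antecedent(world):
        return None
    return bool(consequent(world))


# Hypothetical worlds over two binary variables A and B.
a = lambda world: world["A"]
b = lambda world: world["B"]
verified = evaluate_conditional(a, b, {"A": True, "B": True})
falsified = evaluate_conditional(a, b, {"A": True, "B": False})
undefined = evaluate_conditional(a, b, {"A": False, "B": True})
```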

2.2 The Entropy Principle 

In this paper, knowledge in the form of a set of rules 
$\mathcal{R} = \{B_i|A_i\,[x_i],\ i = 1, \ldots, I\}$ is processed in an 
entropy-optimal way by solving the problem 

$$P^* = \arg\max H(Q) = -\sum_{v} q(v) \cdot \log q(v) \qquad (1)$$ 

subject to: $Q$ satisfies $\mathcal{R}$ $(Q \models \mathcal{R})$ 

[6]. The principle of maximum entropy has been founded axiomatically by several 
authors independently of one another; cf. [2], [8]. The distribution $P^*$ 
generated in this way is precisely the one among all distributions $Q$ in which 
all dependencies explicitly formulated in $\mathcal{R}$, and no others, are 
represented. The optimisation problem (1) thus guarantees information-faithful 
knowledge processing. If situation-specific rules $\mathcal{R}_e$ are now given, 
the known knowledge in $P^*$ is adapted to the new knowledge in $\mathcal{R}_e$ 
in an information-faithful way by solving the problem 

$$P^{**} = \arg\min R(Q, P^*) = \sum_{v} q(v) \cdot \log\frac{q(v)}{p^*(v)} \qquad (2)$$ 

subject to: $Q \models \mathcal{R}_e$; 

see also [6]. 
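
For a single conditional B|A[x], the minimum relative entropy problem (2) can be solved with elementary means, which the following sketch illustrates. It assumes a strictly positive prior and 0 < x < 1, uses the fact that the solution of such a problem with one linear constraint is an exponential tilting of the prior, and finds the multiplier by bisection. It is an illustration only and does not reproduce the algorithm used in SPIRIT; all names are hypothetical.

```python
import math
from itertools import product


def minrent_update(p, in_a, in_b, x, tol=1e-12):
    """Return the distribution q closest to p in relative entropy with q(B|A) = x.

    p    : dict mapping worlds to prior probabilities (all > 0)
    in_a : predicate, True if the antecedent A holds in a world
    in_b : predicate, True if the consequent B holds in a world
    x    : target conditional probability, 0 < x < 1
    """
    def feature(v):
        # q(B|A) = x is equivalent to sum_v q(v) * feature(v) = 0.
        if not in_a(v):
            return 0.0
        return (1.0 - x) if in_b(v) else -x

    def tilt(lam):
        # Exponentially tilted prior, renormalised to sum to 1.
        weights = {v: pv * math.exp(-lam * feature(v)) for v, pv in p.items()}
        z = sum(weights.values())
        return {v: w / z for v, w in weights.items()}

    def residual(lam):
        return sum(qv * feature(v) for v, qv in tilt(lam).items())

    low, high = -60.0, 60.0        # residual is decreasing in lam
    while high - low > tol:
        mid = 0.5 * (low + high)
        if residual(mid) > 0.0:
            low = mid
        else:
            high = mid
    return tilt(0.5 * (low + high))


# Worlds over two binary variables (a, b) with a uniform prior.
worlds = list(product([True, False], repeat=2))
prior = {v: 0.25 for v in worlds}
posterior = minrent_update(prior, lambda v: v[0], lambda v: v[1], 0.9)
```

With a uniform prior, problem (1) is the special case of problem (2) in which the reference distribution carries no information, so the same routine also yields the MaxEnt distribution for one rule.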

3 Knowledge Revision 

3.1 Forms of Knowledge Revision 

Let a probability distribution $P$ be given on the basic space $V$ in which 
known rules $\mathcal{R}$ are valid. If new rules $\tilde{\mathcal{R}}$ become 
known, the knowledge in $P$ generally changes. The literature distinguishes 
between various forms of knowledge revision [3]: 

• Focusing 

Here the already known knowledge $P$ is adapted to new situation-specific 
rules $\mathcal{R}_e$. This form of knowledge revision is directly possible in 
an entropy-optimal environment by solving problem (2). 

• Expansion 

New rules $\tilde{\mathcal{R}}$ supplement the known rules in $\mathcal{R}$; 
all rules in $\bar{\mathcal{R}} := \mathcal{R} \cup \tilde{\mathcal{R}}$ are 
valid under the new knowledge. Expansion is carried out entropy-optimally by 
solving problem (1); cf. [4], [5] (p. 100ff.). 

• Updating 

Suppose two experts have different knowledge about the same domain, so that 
the rules $\mathcal{R}$ formulated by one expert and the rules 
$\tilde{\mathcal{R}}$ of the other are not consistent with each other. Then no 
distribution $P$ exists that satisfies $\mathcal{R} \cup \tilde{\mathcal{R}}$. 
One way to represent both parts of knowledge is to adapt the experts' knowledge 
to each other so that slightly modified rules $\mathcal{R}'$ and 
$\tilde{\mathcal{R}}'$, and hence also $\mathcal{R}' \cup \tilde{\mathcal{R}}'$, 
hold. In their original form they are then in general no longer satisfied. 
Rödder and Xu [7] give several methods by which this form of knowledge 
revision can be carried out in an entropy-optimal environment. 

• Revision in the narrower sense 

The old rules $\mathcal{R}$ are likewise incompatible with new rules 
$\tilde{\mathcal{R}}$. Unlike in updating, however, this form of knowledge 
revision requires that the new rules $\tilde{\mathcal{R}}$ be valid under the 
new knowledge. This corresponds to the case, frequent in human learning, that 
one learns new rules and has to give up earlier convictions in part; cf. the 
example presented in Section 4. 

How revision in the narrower sense can be carried out in an entropy-optimal 
environment is described in the following section. 



3.2 Knowledge Revision in a MaxEnt Environment 

We consider a knowledge revision process that takes place at successive points 
in time $t = 1, \ldots, T$, where at each time $t$ new rules 
$\mathcal{R}_t = \{B_i^t|A_i^t\,[x_i^t],\ i = 1, \ldots, I_t\}$ become known. 
The aim is to model the individual stages of knowledge revision in an 
entropy-optimal environment such that at the current time $t$ the rules 
$\mathcal{R}_t$ are valid (in a sense still to be specified), while the rules 
valid at earlier times are revised as little as possible. Let $W$ and $W_t$ 
denote binary (world) variables; 
$\mathcal{R}_t|W_t := \{B_i^t|A_i^t W_t\,[x_i^t],\ i = 1, \ldots, I_t\}$ are 
then rules conditioned on such world variables. A $T$-stage knowledge revision 
process can be carried out as follows. 

At $t = 1$ the rules $\mathcal{R}_1$ are learned by solving the problem 

$$P^1 = \arg\max H(Q) \qquad (3)$$ 

subject to: $Q \models \mathcal{R}_1$. 

The solution of (3) corresponds exactly to that of (1) and is repeated here only 
for completeness. 

At $t = 2$ the problem 

$$P^2 = \arg\max H(Q) \qquad (4)$$ 

subject to: $Q \models \mathcal{R}_2 \cup \mathcal{R}_1|W_1 \cup \{W_1|W\,[x^{\max}]\} \cup \{W\,[1.0]\}$ 

is solved. Using the world variable $W$, the conditional $W_1|W\,[x^{\max}]$ 
applied to $\mathcal{R}_1|W_1$ achieves that the rules in $\mathcal{R}_1$ are 
revised and the rules $\mathcal{R}_2$ are valid under the knowledge $P^2$. If 
$x^{\max}$ is chosen as the maximum probability with which the world $W_1$ can 
be true, the philosophy pursued is that of a smallest possible revision spread 
evenly over all rules in $\mathcal{R}_1$. The example in Section 4 illustrates 
these relationships in more detail. 

Remark: The probability $x^{\max}$ has the following meaning. 

• $x^{\max} = 0$. The rules $\mathcal{R}_1$ are given up completely. 

• $0 < x^{\max} < 1$. A proper maximum exists for the probability with which 
the world $W_1$ is true under $W$, and hence with which the rules in 
$\mathcal{R}_1$ are true. The larger $x^{\max}$, the less the rules in 
$\mathcal{R}_1$ are revised. 

• $x^{\max} = 1$. Some rules in $\mathcal{R}_1$ are substantially no longer 
contained in the new knowledge $P^2$; this is the case of conditional or weak 
inconsistency, which is treated in detail in [7]. 

Now compute the changed probabilities with which the rules $\mathcal{R}_1$ hold 
under the knowledge $P^2$, namely $P^2(B_i^1|A_i^1)$ for $i = 1, \ldots, I_1$. 
Then $\hat{\mathcal{R}}_1 := \{B_i^1|A_i^1\,[P^2(B_i^1|A_i^1)],\ i = 1, \ldots, I_1\}$ 
denotes the rules in $\mathcal{R}_1$, but with probabilities modified according 
to (4). 

At $t = 3$ one finally solves 

$$P^3 = \arg\max H(Q) \qquad (5)$$ 

subject to: $Q \models \mathcal{R}_3 \cup (\mathcal{R}_2 \cup \hat{\mathcal{R}}_1)|W_2 \cup \{W_2|W\,[x^{\max}]\} \cup \{W\,[1.0]\}$. 

The revision then proceeds iteratively in analogy to (5) up to the time 
horizon $T$. 








4 An Example 

A (fictitious) statistical survey for the automotive industry examined to what 
extent vehicle features such as colour (Farbe), comfort (Comfort), extras 
(Extras) and price (Preis) influence the decision to buy a vehicle. The results 
of this survey are shown in Figure 1. 

Figure 1. Rules $\mathcal{R}_1$ (rule list in SPIRIT): Farbe | Fahrzeug [0.90], 
Verbrauch | Fahrzeug [1.00], Comfort | Fahrzeug [0.50], Extras | Fahrzeug [0.10], 
Preis | Fahrzeug [1.00] 

Thus the purchase decision depends on price and fuel consumption (Verbrauch) for 
all vehicles (100%), on colour for almost all vehicles (90%), on comfort for 
half of the vehicles, and on the extras for 10% of all vehicles. Since the 
information gained from this is still very general, one hopes to gain further 
insight into buyer behaviour by specifying the vehicle classes more closely and 
commissions a second survey. The results are summarised in Figure 2 as the rule 
set $\mathcal{R}_2$. 

Figure 2. Rules $\mathcal{R}_2$ (rule list in SPIRIT): conditionals with 
probabilities such as Farbe | Luxusklasse, Preis | Sportwagen, 
Verbrauch | Nutzfahrzeuge, Fahrzeug | Kleinwagen, Comfort | LKW and 
Luxusklasse | Ferrari 

The knowledge revision process is now carried out by solving problem (4) with 
$x^{\max} = 0.99999$, which is also the maximum admissible value. Under the new 
state of knowledge the rules $\mathcal{R}_2$ now hold, whereas those in 
$\mathcal{R}_1$ have to be revised; see Table 1. While after the first survey 
the price appeared relevant to the purchase decision for all vehicles, this 
statement is now true only for almost all vehicles (98%), so that the earlier 
certain conviction has been given up. The remaining results in Table 1 read 
accordingly. Incidentally, the probabilities for the validity of the rules do 
not always decrease after revision; see rule 4. All computations were carried 
out with the expert system shell SPIRIT [9]. 

Table 1. Rules and revised rules 

   Rule                      $\mathcal{R}_1$    $P^2$ 
1  Farbe | Fahrzeug          0.90               0.87 
2  Verbrauch | Fahrzeug      1.00               0.98 
3  Comfort | Fahrzeug        0.50               0.49 
4  Extras | Fahrzeug         0.10               0.12 
5  Preis | Fahrzeug          1.00               0.98 

5 Conclusion 

This article has shown how knowledge can be revised when new rules are learned 
that contradict already known ones. The result is a probability distribution 
representing the current rules, with the previously valid knowledge revised as 
little as possible. In particular, even certain convictions can be given up in 
this way, which impressively underlines the quality of the knowledge revision 
process in an entropy-optimal environment presented here. 

References 

1. Calabrese, P. M. (1991) Deduction and Inference Using Conditional Logic and 
Probability, Conditional Logic in Expert Systems. I. R. Goodman, M. M. Gupta, 
H. T. Nguyen, G. S. Rogers (editors), Elsevier Science Publishers B. V. 

2. Kern-Isberner, G. (1998) Characterizing the principle of minimum cross-entropy 
within a conditional-logical framework. Artificial Intelligence, Vol. 98, 169-208. 

3. Kern-Isberner, G. (2001) Conditionals in Nonmonotonic Reasoning and Belief 
Revision. Springer Berlin Heidelberg 

4. Reucher, E., Rödder, W. (2001) Relevanz von Information in konditionalen 
Entscheidungsmodellen. OR Proceedings, Springer Berlin Heidelberg 379-386 

5. Reucher, E. (2002) Modellbildung bei Unsicherheit und Ungewissheit in 
konditionalen Strukturen. Dissertation am Fachbereich Wirtschaftswissenschaft der 
FernUniversität in Hagen 

6. Rödder, W. (2000) Conditional Logic and the Principle of Entropy. Artificial 
Intelligence, 117, 83-106 

7. Rödder, W., Xu, L. (2000) Behebung von Inkonsistenzen in der probabilistischen 
Expertensystem-Shell SPIRIT. OR Proceedings, Springer Berlin Heidelberg 260-265 

8. Shore, J.-E., Johnson R.-W. (1980) Axiomatic Derivation of the Principle of 
Maximum Entropy and the Principle of Minimum Cross-Entropy. IEEE Transact 
Inf Theory IT-26(1): 26-37 

9. SPIRIT, http://www.fernuni-hagen.de/BWLOR/spirithome.html 





Innovation, Operations Research & Decision
Support in the Military



Heiner Micko 

National Defense Academy 
A - 1070 Vienna 

Operations Research as a scientific discipline began some decades ago within
the Military, focusing at that time on logistics and on the effective
distribution of resources.

Today Operations Research plays a strong role in nearly all fields of the
innovative economy, and has also gained a certain acceptance in various
government departments.

Military applications of Operations Research now include all topics of
governmental decision support in peacetime as well as more operational tasks of
decision support in wartime and war gaming applications, the latter often being
called, and confused with, cyber war.

The main purpose of Operations Research in the Military can be seen as providing
a knowledge base for decisions, innovation and change management, promoting and
supporting decisions about organizational change.

This paper tries to explain the theoretical role of decision support within an
innovation context, and to give some possible practical options for Operations
Research in the Military.



1. The Science of National Economy and Early Operations Research 

During the 19th and early 20th centuries many if not all Armed Forces tried to
bring scientific methods into the military planning process¹, thus defining the
tasks and interrelationship of politics, strategy, and what was called the
operational art.

The emergence of capitalist and Marxist theories about global and national
economies, as well as the many questions arising from the world economic crisis
in the 1920s/1930s, gave strong impetus to general explanatory efforts and
approaches².

Logistic calculus came into common use in the daily life of business
management³ in order to make optimal use of limited resources. For more complex
decisions, mathematicians and engineers were called upon together. Operational
research was done explicitly for the first time in the military⁴.

Furthermore, the concept of total war and the concentration of all national
resources on the main goal of winning the Second World War brought all
questions about strategic disposition from the military and the national
economy closely together.



¹ see General Theories "On War" by CLAUSEWITZ 1832 and its reception to the present day
² see Economic Theories by TAYLOR 1911 and KEYNES 1936
³ see Fordism, Industry and Technical Intelligence by OTTILIENFELD 1926
⁴ see Equations of Combat by LANCHESTER 1916, and see Comparative Studies of World
War Casualties by GILCHRIST 1928







Not only the global players of that time, Germany and America, but also the
European defense analysis community as a whole were engaged with mathematical
models in economic and military logistics on strategic and operational levels.

After the Second World War, military applications of mathematical models were
extended to all fields of defense, primarily including all Cold War
intelligence results and "what if" scenarios. Such models sought to compare
existing weaponry on both sides of the Iron Curtain, and led to the development
of strategic arms control treaties and the subsequent verification efforts.

Due to scientific progress, especially in computer science, the Military during
the 1960s developed euphoric expectations of artificial intelligence, war
gaming and Operations Research. This overestimation soon gave way to a more
realistic attitude, but led to frequent use of expert systems. In particular,
the implementation of cybernetic measurement and control in early autonomous
agents or robots showed the complexity of the man-nature system. The rise of
Systems Thinking⁵ on the one hand, and fuzzy logic and fractals for describing
chaotic conditions on the other, provided mathematical approaches for dealing
with the recognition of limited predictability as well as limited resources⁶
worldwide.



2. Knowledge Based Society, Innovation and Change Management 

Economic cycles⁷ brought about "basic innovations" in a sequence of 30 to 60
years, such as the steam engine and railroads, electricity, the chemical
synthesis of organic materials like aniline and rubber, the combustion engine
and motor car, the transistor and computer, nuclear fission and the atomic bomb
or, lately, genetic engineering.

The development of "incremental innovations" out of the basic innovations can
be recognized at early stages, when they are published or patents are pending
(invention), or later on, when they are marketable (innovation). The novelty of
any invention is established legally in the patent application procedure, but
in practice businesses follow different strategies to dominate the market.
Competition in a specific field of technology therefore often leads to
significant details being kept secret, without a patent, even long after
successful introduction to the market.

There is a difference between objective and subjective innovation. Whoever
innovates objectively uses a product or process for the first time worldwide.
Subjective innovation occurs when individuals, enterprises or regional entities
such as nation states pick up an idea as a possibility shown earlier by an
objective innovation: the subjects in that sense use objective technical
knowledge and abilities that already exist and have been realized somewhere.

Research on innovations looks closely at the diffusion process of each
innovation, also known as technology transfer. The diffusion of an innovation
over space and time usually follows the typical, well-known growth patterns.
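
One growth pattern often used to describe diffusion of this kind is the
logistic (S-shaped) curve. The text names no specific model, so the following
Python sketch rests on that assumption; the function name adoption_share and
the parameters midpoint and rate are illustrative choices, not taken from the
paper.

```python
import math


def adoption_share(t: float, midpoint: float = 10.0, rate: float = 0.5) -> float:
    """Cumulative share of adopters at time t under a logistic curve.

    Assumption: logistic growth. `midpoint` is the time at which half of the
    eventual users have adopted; `rate` controls how steep the take-off is.
    Both default values are purely illustrative.
    """
    return 1.0 / (1.0 + math.exp(-rate * (t - midpoint)))


if __name__ == "__main__":
    # Print the adoption share every five time units.
    for year in range(0, 21, 5):
        print(year, round(adoption_share(year), 3))
```

With these illustrative parameters the curve starts near zero, passes one half
at t = 10 and then flattens, which is the general shape behind the sequence
from first users through take-off to market saturation.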

Some preconditions are needed to implement an innovation on a specific market: 



⁵ see General Systems Theory by BERTALANFFY 1968, and see The Transfer of Systems
Thinking from the Pentagon to the Great Society by JARDINI, in: HUGHES 2000
⁶ see Limits to Growth by MEADOWS, MEADOWS et al. 1972
⁷ see Theory of Economic Cycles by KONDRATIEFF 1926, SCHUMPETER 1935








Table 1. Innovation Preconditions (MICKO 2002)

Precondition        Functions concerning the innovation process

feasibility         technology push
affordability       market pull, economic demand
entrepreneurship    individual initiative of an innovator
resources           - physical + political infrastructure
                    - material + capital
                    - personnel + human capital
capability          know-how transfer

Whenever the preconditions are met, the innovation diffuses as follows:

Table 2. Innovation Diffusion (see BRASCHE 1989)⁸

User segments, varying over time in the innovation diffusion process:
a - first user, innovator
b - early users
c - early majority
d - "take-off" of this innovation in this market
e - late majority
f - stragglers
g - market saturation
h - non-user segment, rejecters of an innovation

The non-user segment is important in cases of technologies that are rejected or
forbidden in a specific market, e.g. nuclear fission, genetically engineered
food, or international treaty obligations on the non-proliferation of defense
technologies.

Fig. 1. Innovation between Research, Education, Politics and Economy (MICKO 2002) 





Realization of technical feasibility is an early phase of the innovation
process, in which the initiative of an innovator meets new knowledge from
research and, with political consent, a strategic endowment of resources (means
for development) is allocated to support the innovation and build the first
prototypes.




[Fig. 1 diagram; recoverable labels: OBJECTIVES (politics + strategy),
KNOWLEDGE (research + ...), REALIZATION (economy + operations),
ABILITY (... + experience), linked by the questions feasibility?,
affordability?, capability?]



⁸ see Qualification, Innovation Processes and Diffusion by BRASCHE 1989








Realization of market chances and affordability is possible only after the
feasibility has been established, and depends on the first results of
development. The influence of politico-strategic objectives diminishes at this
stage in favor of the economic and operational adoption process within the
innovator's enterprise.

Implementation of the production capability is possible only after the market
chances and the affordability have been established, and thus depends on
further results of development. Critical for success is the development of the
organization, and mainly of the human capital ("organizational learning"),
within the innovator's enterprise.

This last step is regarded as the greatest challenge for every enterprise
today: to meet the "knowledge based society" and enable good "change
management". Although the civil service in general tends not to be innovative
or adaptive and not to follow best practice from industry, the same
requirements evidently apply to change management in the public sector, and
also in armed forces: organizational learning always combines the proven with
the newly recognized.

Whenever an organization is to be changed, there is a need to look at the state
of the art, at science and technology, at practice, or at least at failures. In
military terms, an estimate of the situation requires not only a task, own and
enemy troops, and the environment, but in particular a sound estimate of the
options available to act or to react: a high commander's decisions need support.

Scientific support will certainly not be given to low or intermediate echelons,
but to high or top levels only. We therefore state that all resources given to
scientific research and development in an enterprise are allocated by the
strategic leadership. The result certainly depends mainly on the efficiency of
operational implementation, but the responsibility for science in any economic
process always lies at the top level.

Fig. 2. Development of Organization as Center of Innovation Processes (MICKO 2002) 



[Fig. 2 diagram; recoverable labels: OBJECTIVES (politics + strategy),
affordability?, REALIZATION (economy + operations), research, technology
development, organization + human capital, know-how-transfer, product- or
process-transfer, education]



The initiative for the transition from research (1) to technical development (2)
during the initial innovative phases ("feasibility, affordability, capability")
is always the strategic responsibility of top management. Beyond that, the
strategic influence on innovation processes consists mainly of strategic
controlling (3) and periodic re-assessment of the central question (4): the
market chances ("affordability").

The operational fields of the enterprise first have to deal with the innovation
process when contributing to the economic assessment of planned initiatives (4).
The operationalization of technical developments during early phases of the
innovation process (5) has to be done mainly through development of the
organization (6). To implement a new structural and/or workflow organization
and to build an efficient and ultimately effective innovation, the decisive
point is, more than anything else, the development of human capital (7),
starting as early as possible in close conjunction with the late phases of
research and technical development.

The aspects of know-how transfer from technical development to the development
of human capital, and of product transfer from procurement to the development
of a new structural organization, are the widely recognized components of
organizational change. Less visible, but nevertheless important, are the
development of processes, the development of a new workflow organization, and
the transfer of procedural knowledge.

An additional important aspect, which represents an optional, suitable starting
point for Operations Research in the innovation process, is the necessary
meshing of the procedural organization with the experience present in the
enterprise. Difficult to measure, and thus also difficult to promote, is the
much-discussed "tacit knowledge", the non-codified technological know-how found
in all enterprises. It is essential and crucial to all innovation processes.
Quality management, performance evaluation, and other instruments of operative
controlling also serve this main purpose: to actively search for, identify and
shape these "weak" and scarcely quantifiable criteria for success or failure.



3. Operations Research and Decision Support in the Military

The military estimate of the situation requires not only a task, own and enemy
troops, terrain and environment, but in particular a sound estimate of the
options available to act or to react: a high commander's decision needs support.

As described before, the use of scientific resources for a specific innovation
project, as an operational decision support task, is always a strategic
investment decision (capital and human capital, budget and time budget) within
an enterprise. Under certain circumstances, whether a specific innovation
process was funded through all its steps can be crucial for competition:
crucial for victory or defeat in war. On the other hand, some of the efforts
may well prove useless.

That is why decision support in wartime, and especially war gaming applications
already in peacetime, are given a high priority whenever Armed Forces are able
to allocate additional resources and to re-develop their organization.

To give some examples, all defense organizations have recently been
re-orienting themselves, under the evolving changes in threats, towards
constabulization, operations other than war, anti-terrorism,
anti-organized-crime and peace support.

Specifically, the anti-terrorism task requires the Military not only to deal
with conventional or strategic defense questions as in the years of the Cold
War, but also to engage in interagency operations. This means including all
national resources, not only the military ones, in a coordinated effort to
fight the threat.

Austrian defense in the European context still maintains neutrality in case of
war. In the much more likely cases of non-war, undeclared war and
subconventional attack or threat, however, we are to contribute to the overall
efforts to construct a comprehensive security umbrella for the Austrian
population.

The role of Operations Research in this regard is to give estimates of the
options to act or to react, using the full spectrum of quantitative and
qualitative methodology that this scientific discipline has developed.

The main purpose of Operations Research in the Military in general is to be a
knowledge base for decisions, innovation and change management, promoting and
supporting decisions about organizational change as described in detail above.



Literature:

BERTALANFFY, L: General Systems Theory, New York 1968
BRASCHE, U: Qualifikation - Engpass im Innovationsprozess? Die Diffusion
    von Mikroelektronik und die Veränderung der Qualifikationsanforderungen,
    Berlin 1989
CLAUSEWITZ, On War, Berlin 1832
GILCHRIST, H.L: A Comparative Study of World War Casualties from Gas and
    Other Weapons, Chemical Warfare School 1928
JARDINI, D: Out of the Blue Yonder: The Transfer of Systems Thinking from the
    Pentagon to the Great Society, in: HUGHES, T., HUGHES, A. (eds.):
    Systems, Experts, and Computers, MIT 2000
KEYNES, J.M: The General Theory of Employment, Interest, and Money,
    New York 1936
KONDRATIEFF, N.D: Die langen Wellen der Konjunktur, in: Archiv für
    Sozialwissenschaft und Sozialpolitik, Tübingen 1926
LANCHESTER, F.W: Equations of Combat, in: Aircraft in Warfare, London 1916
MEADOWS, D.H., MEADOWS, D.L. et al.: Limits to Growth, New York 1972
OTTILIENFELD, F.G: Fordism, Industry and Technical Intelligence, Jena 1926
SCHUMPETER, J.A: The Analysis of Economic Change, in: Review of Economic
    Statistics, MIT 1935
TAYLOR, F.W: Scientific Management, New York 1911





From Predicate Logic to
Business Decision Support



Friedhelm Kulmann and Wilhelm Rödder

Lehrstuhl für Betriebswirtschaftslehre, insb. Operations Research
FernUniversität in Hagen

friedhelm.kulmann@fernuni-hagen.de



Abstract The expert system shell SPIRIT provides a powerful instrument for
business decision support based on probabilistic conditional knowledge. Several
publications, for example [5] and [3], have reported on its professional use in
practice. Because of the high complexity of the decision mechanisms, intensive
discussions are usually needed before the underlying theories are accepted;
even then, last doubts are often not fully dispelled. This contribution chooses
a new field of application for SPIRIT and thereby creates a previously
unavailable point of access. Conditional models are also suited to solving
predicate-logic problems such as those posed in so-called logic puzzles. For
such problems this demonstrates that the human ability to deduce knowledge from
knowledge can be formalized on the computer. After the verbal formulation of a
predicate-logic problem selected as an example, the problem is modelled
causally in the shell SPIRIT and solved unambiguously for various kinds of
questions. The last part of the contribution points out analogies which make
clear that the path from logic teasers to serious economic applications is not
long. An understanding of the mechanism for modelling and solving
predicate-logic problems may thus help to reduce resistance to the use of
probabilistic, mathematically founded concepts.



1 Introduction

A central component of knowledge-based modelling of business decision problems
is capturing the relationships by means of building blocks that can be
interpreted as predicate-logic formulas. Predicate logic is ideally suited to
taking the internal structure of statements into account and to expressing
relations between properties of objects (cf. [1]). Ultimately it can also be
used to formulate questions and to derive answers from what has been "learned"
so far. It can thus serve as a preliminary stage for translating the decision
problem into probabilistic conditional logic.

Unfortunately, companies that are potential users of powerful decision support
tools show, on the one hand, a general skepticism regarding applicability and,
on the other hand, mistrust of the decisions proposed. In this contribution we
have therefore not chosen the classical approach of describing in detail the
rich language available and the inference mechanism employed; instead we use
the wide field of logic puzzles, familiar to everyone, to present the system
SPIRIT as an instrument of decision support (cf. [6]).

Section 2 presents a puzzle and then formalizes it in the notation of predicate
logic. Section 3 translates it into a knowledge base of the probabilistic
system SPIRIT; in addition, the inference mechanism is explained and some
example queries are posed. Section 4 shows, as indicated at the outset,
business questions with the same logical structure that can be answered
analogously.

2 From the Verbal Model to Logical Structures

To prepare for the use of computer systems in solving problems from the
business environment, the relevant quantities and influencing factors must
first be formulated verbally. The resulting, not yet formalized, so-called
verbal model must then be analyzed with respect to its causal structure and
translated into a language whose formalism expresses these dependencies
directly.

For problems from the field of logic, the verbal model is usually reduced to a
minimum, and in order to answer the questions posed one is required to carry
out the above analysis and translation, at least mentally. These steps are
particularly difficult when solving the problem is hampered by a lack of
association patterns. As an example, consider the following puzzle, published
on the Internet by Janko, A. and Janko, O. under the title »Pprills, Squirde
und Glopps« (see [2]).

"All educated people now know that Pprills, Squirds and Glopps are simply forms
of Nahfs. It has also been proven that Squirds are both Glopps and Nahfs.

There is, however, a complication: it was recently discovered that there are
Glopps that are neither Squirds, Pprills nor Gdynxes. In addition, there are
Squirds that are neither Gdynxes nor Pprills.

Admittedly, some Pprills and some Gdynxes are Glopps, and some Squirds as well.
But now we know more about Gdynxes: some are Squirds, some are Glopps, and
some, oddly enough, are both Pprills and Squirds.

a) Are there Glopps among the Gdynxes that are not Nahfs?

b) If a Pprill is a Squird, is it then definitely a Glopp? [...]

c) Is there actually a creature in this universe that is in the position of
   having to claim to be a Pprill and a Nahf as well as a Squird and a Glopp
   and, on top of that, a Gdynx too?" (cf. [2])

The first step of the formalization consists in describing the unary predicates
Nahf, Glopp, Squird, Pprill and Gdynx, which are used to characterize objects.
The complete result of the translation is given by the following formulas 1 to
10.

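1. $\forall x:\ \mathrm{Glopp}(x) \rightarrow \mathrm{Nahf}(x)$

2. $\forall x:\ \mathrm{Squird}(x) \rightarrow \mathrm{Nahf}(x)$

3. $\forall x:\ \mathrm{Pprill}(x) \rightarrow \mathrm{Nahf}(x)$

4. $\forall x:\ \mathrm{Squird}(x) \rightarrow (\mathrm{Glopp}(x) \wedge \mathrm{Nahf}(x))$

5. $\exists x:\ \mathrm{Glopp}(x) \rightarrow (\neg\mathrm{Squird}(x) \wedge \neg\mathrm{Pprill}(x) \wedge \neg\mathrm{Gdynx}(x))$

6. $\exists x:\ \mathrm{Squird}(x) \rightarrow (\neg\mathrm{Gdynx}(x) \wedge \neg\mathrm{Pprill}(x))$

7. $\exists x:\ \mathrm{Pprill}(x) \rightarrow \mathrm{Glopp}(x)$

8. $\exists x:\ \mathrm{Gdynx}(x) \rightarrow \mathrm{Glopp}(x)$

9. $\exists x:\ \mathrm{Gdynx}(x) \rightarrow \mathrm{Squird}(x)$

10. $\exists x:\ \mathrm{Gdynx}(x) \rightarrow (\mathrm{Pprill}(x) \wedge \mathrm{Squird}(x))$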

Without exception these are either implications bound by the universal
quantifier (1 to 4) or existential statements (5 to 10). In Section 3, which
follows, embedding them in a system for probabilistic knowledge processing
makes it possible to answer complex questions in this universe of bizarre
creatures.
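
As a cross-check that is independent of SPIRIT, the universally quantified
formulas 1 to 4 can be tested by brute force over all 2^5 truth assignments to
the five predicates. The following Python sketch is an illustration only, not
the method of the paper. It treats each creature type as one truth assignment,
and it reads existential formula 10 as "at least one Gdynx that is both Pprill
and Squird exists", which is an assumption about the intended meaning. The
helper names admissible, possible and certain are hypothetical.

```python
from itertools import product

PREDICATES = ("Nahf", "Glopp", "Squird", "Pprill", "Gdynx")


def admissible(w: dict) -> bool:
    """True if a creature type satisfies the universal formulas 1 to 4."""
    return (
        (not w["Glopp"] or w["Nahf"])                        # formula 1
        and (not w["Squird"] or w["Nahf"])                   # formula 2
        and (not w["Pprill"] or w["Nahf"])                   # formula 3
        and (not w["Squird"] or (w["Glopp"] and w["Nahf"]))  # formula 4
    )


# Every truth assignment to the five predicates, then only the admissible ones.
ALL_WORLDS = [dict(zip(PREDICATES, v)) for v in product((False, True), repeat=5)]
WORLDS = [w for w in ALL_WORLDS if admissible(w)]


def possible(condition) -> bool:
    """Some admissible creature type satisfies the condition."""
    return any(condition(w) for w in WORLDS)


def certain(premise, conclusion) -> bool:
    """Every admissible creature type with the premise also has the conclusion."""
    return all(conclusion(w) for w in WORLDS if premise(w))


# a) Can a Gdynx that is not a Nahf be a Glopp?
answer_a = possible(lambda w: w["Gdynx"] and not w["Nahf"] and w["Glopp"])
# b) Is every Pprill that is a Squird also a Glopp?
answer_b = certain(lambda w: w["Pprill"] and w["Squird"], lambda w: w["Glopp"])
# c) Does the creature asserted by formula 10 necessarily have all five properties?
answer_c = certain(
    lambda w: w["Gdynx"] and w["Pprill"] and w["Squird"],
    lambda w: all(w.values()),
)

if __name__ == "__main__":
    print(len(WORLDS), answer_a, answer_b, answer_c)  # prints: 14 False True True
```

Of the 32 assignments, 14 are compatible with formulas 1 to 4. Under the stated
reading, the sketch answers a) with no, b) with yes and c) with yes.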

3 The Logical Structure as a Knowledge Base

For the translation into a probabilistic knowledge base, the unary predicates
are first interpreted as Boolean variables. In the example above, Nahf, Glopp,
Squird, Pprill and Gdynx form the complete set of variables, whose positive and
negative values thus represent the event space Ω. Formulas formed by
conjunction, disjunction and negation can be identified in a natural way with
events from Ω. The conditional is used to represent the implication expressed
in predicate-logic formulas. The universal quantifier expresses the general
validity of a formula; when embedded in the probabilistic knowledge base, the
corresponding conditional holds with certainty. The existential quantifier
imposes a weaker requirement on the conditional; its probability must be
strictly positive. The modelling questions this raises are discussed in the
example below.

From formulas 1 to 10, the conditionals shown in Figure 1 follow directly for
SPIRIT. Since the example gives only qualitative information, a positive
(small) probability for the validity of the conditioned statement is assumed
when the existential statements are translated into SPIRIT syntax. Since the
numerical value does not affect the result, 0.01 was chosen here.



[header illegible]  [header illegible]  Rule text

1                   1.000               Nahf | Glopp
1                   1.000               Nahf | Squird
1                   1.000               Nahf | Pprill
1                   1.000               (Glopp ∧ Nahf) | Squird
0.01                0.010               (¬Squird ∧ ¬Pprill ∧ ¬Gdynx) | Glopp
0.01                0.010               (¬Gdynx ∧ ¬Pprill) | Squird
0.01                0.010               Glopp | Pprill
0.01                0.010               Glopp | Gdynx
0.01                0.010               Squird | Gdynx
0.01                0.010               (Pprill ∧ Squird) | Gdynx

Figure 1. Knowledge base for the logic puzzle »Pprills, Squirde und Glopps«



On the basis of well-established information-theoretic results, SPIRIT makes it
possible to generate a knowledge base that represents all required causalities
with the specified probabilities and, beyond that, does not create any
unintended dependencies between variables. Mathematically, this
information-faithful build-up of knowledge is achieved by adhering to the
principle of maximum entropy (see [6]). Figure 2 shows the so-called
(undirected) dependency graph displayed by the system, with the variables as
nodes, each of which additionally shows its marginal distribution (for the
theoretical background see also [4]).
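
In general terms, the principle of maximum entropy can be written as the
optimization problem below. This is a standard textbook formulation given as a
sketch, not a statement of SPIRIT's internal algorithm. The notation is
introduced only for this illustration: $\Omega$ is the set of all truth
assignments to the five variables, $P$ a probability distribution on $\Omega$,
$B_i \mid A_i$ the $i$-th rule of the knowledge base and $x_i$ its specified
probability.

```latex
\begin{aligned}
\max_{P}\quad & H(P) = -\sum_{\omega \in \Omega} P(\omega)\,\log P(\omega) \\
\text{subject to}\quad & P(A_i \wedge B_i) = x_i \, P(A_i), \qquad i = 1,\dots,10, \\
& \sum_{\omega \in \Omega} P(\omega) = 1, \qquad P(\omega) \ge 0 .
\end{aligned}
```

Writing each rule as $P(A_i \wedge B_i) = x_i\,P(A_i)$ rather than
$P(B_i \mid A_i) = x_i$ keeps the constraints linear in $P$ and well defined
even when $P(A_i) = 0$.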














Figure 2. Dependency graph for the logic puzzle










The questions posed in Section 2 have so far remained unanswered. Formulated in
probabilistic terms, part a), for example, reads: "What is the probability of a
Glopp, given that it is a Gdynx and not a Nahf?" As a query conditional for the
knowledge base, the formal notation is: [?] Glopp | (Gdynx ∧ ¬Nahf) (see also
Figure 3). Using the inference mechanism implemented in SPIRIT, the questions
posed in Section 2 under a), b) and c) can now be answered (cf. [6], [7]).
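
How a query conditional of this kind is evaluated can be sketched in a few
lines of Python. The sketch is not SPIRIT's inference mechanism. As a
simplifying assumption it uses only the four rules with probability 1, for
which the maximum-entropy distribution is uniform over the admissible truth
assignments, and it ignores the rules with probability 0.01. Its numerical
value for query c) therefore differs from the one SPIRIT reports. Query b) is
read here as Glopp | (Pprill ∧ Squird), and the function names prob and query
are hypothetical.

```python
from fractions import Fraction
from itertools import product

VARIABLES = ("Nahf", "Glopp", "Squird", "Pprill", "Gdynx")


def satisfies_certain_rules(w: dict) -> bool:
    """Rules with probability 1: Nahf|Glopp, Nahf|Squird, Nahf|Pprill, (Glopp ∧ Nahf)|Squird."""
    return (
        (w["Nahf"] or not w["Glopp"])
        and (w["Nahf"] or not w["Squird"])
        and (w["Nahf"] or not w["Pprill"])
        and ((w["Glopp"] and w["Nahf"]) or not w["Squird"])
    )


# Elementary events: all truth assignments that the certain rules allow.
EVENTS = [
    w
    for w in (dict(zip(VARIABLES, v)) for v in product((False, True), repeat=5))
    if satisfies_certain_rules(w)
]
# Assumption: uniform distribution over the admissible elementary events.
P = Fraction(1, len(EVENTS))


def prob(event) -> Fraction:
    """Probability of an event given as a predicate on elementary events."""
    return sum((P for w in EVENTS if event(w)), Fraction(0))


def query(conclusion, premise) -> Fraction:
    """Conditional probability P(conclusion | premise) for a premise with positive probability."""
    p_premise = prob(premise)
    if p_premise == 0:
        raise ValueError("premise has probability zero")
    return prob(lambda w: premise(w) and conclusion(w)) / p_premise


query_a = query(lambda w: w["Glopp"], lambda w: w["Gdynx"] and not w["Nahf"])
query_b = query(lambda w: w["Glopp"], lambda w: w["Pprill"] and w["Squird"])
query_c = prob(lambda w: all(w.values()))

if __name__ == "__main__":
    print(query_a, query_b, query_c)  # prints: 0 1 1/14
```

Queries a) and b) give 0 and 1, the values forced by the certain rules alone.
Query c) evaluates to 1/14 only because of the uniform assumption made here.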



Fact     Rule text

0.000    Glopp | (Gdynx ∧ ¬Nahf)
1.000    (Glopp | Pprill) | (Squird | Pprill)
0.004    Pprill ∧ Nahf ∧ Squird ∧ Glopp ∧ Gdynx

Figure 3. Queries for the logic puzzle



a) Among the Gdynxes that are not Nahfs there is no Glopp.

b) If a Pprill is a Squird, then it is definitely a Glopp.

c) It is improbable, but there actually is a creature in this universe that is
   in the position of having to claim to be a Pprill and a Nahf as well as a
   Squird and a Glopp and, on top of that, a Gdynx too.

Having dealt with the problem in its abstract logical form, it remains to ask
what connection to reality it might have.

4 The Business Decision Problem

The management of a company is faced with a puzzle only in the rarest of cases,
or so one hopes; nevertheless, the logical problem structures are often
similar. The example above is therefore transferred into the terminology of
marketing and interpreted as an analysis of the customer structure of a
supermarket chain for the purpose of carrying out targeted advertising
measures. The logical equivalence of the statements can be verified directly by
comparing them with the original formulations in Section 2.

According to a survey there are, among others, three typical target groups of
female customers, namely mothers, part-time employees and housewives. Owing to
their particular situation, the part-time employees among the customers usually
also run the household. On the other hand, there certainly are housewives who
are neither employed part-time, nor have a child, nor are single parents.
Furthermore, certain part-time employees are neither single parents nor have a
child. Mothers and single parents can be housewives as well, as can some
part-time employees. If a customer is a single parent, she may well be employed
part-time, or a housewife, or even a part-time employed mother.

These somewhat unstructured group dependencies were determined through open
interviews and can be regarded as reliable. Surprisingly, important conclusions
about the overall structure of the customers with regard to advertising
measures can be drawn from this incomplete information. The statements follow
directly from transferring the answers given at the end of the previous
section.

a) Advertising for new customers aimed at the target group "single parents" in
   no case also reaches "housewives".

b) Mothers who are employed part-time are automatically also addressed by the
   advertising message for housewives.

c) There certainly is a group of customers, albeit possibly a small one, who
   respond to every advertising campaign aimed at a target group.
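
The transfer from the puzzle to the marketing setting amounts to renaming the
five variables. The paper does not list the correspondence explicitly; the
mapping in the following Python sketch is inferred by comparing the statements
of Section 2 with those of Section 4 and should be read as an assumption. The
names MARKETING_TERM and relabel are hypothetical.

```python
# Correspondence inferred by comparing Section 2 with Section 4 (an assumption;
# the paper does not state it explicitly).
MARKETING_TERM = {
    "Nahf": "customer",
    "Pprill": "mother",
    "Squird": "part-time employee",
    "Glopp": "housewife",
    "Gdynx": "single parent",
}


def relabel(rule: str) -> str:
    """Replace each puzzle predicate in a rule string by its marketing term."""
    for creature, term in MARKETING_TERM.items():
        rule = rule.replace(creature, term)
    return rule


if __name__ == "__main__":
    # Query a) of the puzzle, expressed in marketing terms.
    print(relabel("Glopp | (Gdynx ∧ ¬Nahf)"))
```

Because only the labels change, every rule and query of the puzzle carries over
to the marketing reading with the same logical structure.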

5 Conclusion and Outlook

By entering the complex subject of probabilistic knowledge processing and the
modelling of conditional structures through a problem posed as a logic puzzle,
a new perspective has been opened up for solving business questions as well.
Even the simple example presented has shown the need to capture complex logical
dependencies. In addition, it has been shown that knowledge-based systems are
not only able but also urgently required to provide suitable structuring
support for decision problems.

References

1. Genesereth, M.R.; Nilsson, N.J. (1989): Logische Grundlagen der Künstlichen
Intelligenz, Vieweg, Braunschweig, 1989.

2. Janko, A.; Janko, O. (2002): Rätsel und Denksport.
http://www.janko.at/Raetsel/Logik/003.a.htm (as of August 2002).

3. Kulmann, F. (2002): Wissen und Information in konditionalen Modellen -
zur Entscheidungsvorbereitung im Anfrage- und Auftragsmanagement, DUV
Gabler, Wiesbaden, 2002.

4. Meyer, C.-H. (1998): Korrektes Schließen bei unvollständiger Information,
Europäische Hochschulschriften, Peter Lang, Frankfurt a.M., 1998.

5. Kulmann, F.; Reucher, E. (2000): Computergestützte Bonitätsprüfung bei
Banken und Handel, DBW - Die Betriebswirtschaft, 60 (2000) 118-122.

6. Rödder, W. (2000): Conditional Logic and the Principle of Entropy, Artificial
Intelligence, 117 (2000) 83-106.

7. Rödder, W. (2001): Knowledge Processing under Information Fidelity, Proc.
IJCAI 2001 - Seventeenth International Joint Conference on Artificial
Intelligence, Seattle, Washington (2001) 749-754.





